TITLE OF THE INVENTION
POLYMORPHIC DNA MARKERS IN BOVIDAE

CROSS-REFERENCE TO RELATED APPLICATIONS

The present invention is a continuation-in-part of
U.S. Ser. No. 642,342, filed January 15, 1991,
incorporated herein by reference.

FIELD OF THE INVENTION 

The invention relates to gene mapping, selective
breeding and genetic identification in domestic animals.

BACKGROUND OF THE INVENTION

The publications and other materials used herein to
illuminate the background of the invention, and in
particular cases, to provide additional details
respecting the practice, are incorporated by reference
and for convenience are numerically referenced in the
following text and respectively grouped in the appended
bibliography.

Until recently, artificial selection has relied on
the biometrical evaluation of individual breeding values
from an animal's own performance and from the performance
of its relatives (136). This biometrical strategy is based
on relatively simple genetic premises, operating within
a "black box". Briefly, the majority of economically
important traits are so-called complex or quantitative
traits, meaning that the phenotype of an animal is
determined by both environment and a large number of
genes with individually small, additive effects. The
proportion of the phenotypic variation observed in a
given population that is genetic in nature is the
heritability of the trait. Substantial genetic progress
has been obtained using this approach. One of the
strengths of this biometrical approach is that it obviates
the need for any detailed molecular knowledge of the
underlying genes or Economic Trait Loci.

However, it is believed that the molecular
identification of these Economic Trait Loci could
increase genetic response by affecting both time and
accuracy of selection, through a procedure called Marker
Assisted Selection (91, 96). One strategy towards the
isolation of Economic Trait Loci relies on the use of
DNA Sequence Polymorphisms as genetic markers in linkage
studies. This approach, paradoxically referred to as
"Reverse Genetics" (138), will be described in detail in
this introduction. Moreover, we propose a new concept
called "Velogenetics", or the combined use of Marker
Assisted Introgression and germ-line manipulations to
shorten the generation interval of domestic species
(especially cattle), which will allow the rapid and
efficient introgression of mapped Economic Trait Loci
between genetic backgrounds.

I. DNA SEQUENCE POLYMORPHISM (DSP)

A. Types of DNA Sequence Polymorphism

The typical mammalian genome is composed of a DNA
stretch approximately 3 x 10^9 base pairs long, divided
over a species-specific number of chromosomes, and
containing all the information required for the proper
development and functioning of a normal being. Each
individual has two copies of this message: one paternal
in origin and one maternal. Although their overall
architecture and content are virtually identical, the
paternal and maternal DNA sequences exhibit subtle
"allelic" differences, hereinafter referred to as DNA
Sequence Polymorphisms or "DSP". The DSP that can be
recognized in a given population are the molecular basis
of the genetic component of the observed phenotypic
variance. One can distinguish three types of DSP.

1. Single Base Pair Polymorphisms
As their name implies, these DSP are due to single
base pair differences distinguishing alleles. These can
be either base pair substitutions, namely transitions
(purine to purine or pyrimidine to pyrimidine) and
transversions (purine to pyrimidine and vice versa), or
the insertion/deletion of a single base pair.
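
The distinction between transitions and transversions
drawn above can be sketched in a few lines of Python. This
is only an illustration: the function name
classify_substitution is hypothetical and is not part of
the original disclosure.

```python
# Purines and pyrimidines among the four DNA bases.
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def classify_substitution(ref: str, alt: str) -> str:
    """Classify a single base pair substitution (hypothetical helper).

    A transition stays within one chemical class (purine to purine or
    pyrimidine to pyrimidine); a transversion crosses between classes.
    """
    ref, alt = ref.upper(), alt.upper()
    valid = PURINES | PYRIMIDINES
    if ref not in valid or alt not in valid:
        raise ValueError("bases must be one of A, C, G, T")
    if ref == alt:
        raise ValueError("identical bases are not a substitution")
    same_class = {ref, alt} <= PURINES or {ref, alt} <= PYRIMIDINES
    return "transition" if same_class else "transversion"
```

Under this sketch, a C to T change is reported as a
transition and an A to C change as a transversion.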

The frequency of single base pair polymorphism is
measured by the nucleotide diversity, π, or average
heterozygosity per nucleotide site (1). The nucleotide
diversity has been estimated from Restriction Fragment
Length Polymorphisms at 0.002 for human (2), and at
0.0007 in cattle (3, 4). This means that on average
a human will be heterozygous at one in every 500
nucleotides, and a cow at roughly one in every 1,500
nucleotides.
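
The conversion from nucleotide diversity to an average
spacing between heterozygous sites is a simple reciprocal.
The sketch below assumes heterozygous sites are spread
evenly along the sequence, which is a simplification; the
function name is hypothetical.

```python
def mean_spacing_between_heterozygous_sites(nucleotide_diversity: float) -> float:
    """Average number of nucleotides per heterozygous site.

    Assumes an even spread of heterozygous sites, so the spacing is
    simply the reciprocal of the per-site heterozygosity.
    """
    if nucleotide_diversity <= 0:
        raise ValueError("nucleotide diversity must be positive")
    return 1.0 / nucleotide_diversity


# Estimates quoted in the text: 0.002 for human, 0.0007 for cattle.
human_spacing = mean_spacing_between_heterozygous_sites(0.002)
cattle_spacing = mean_spacing_between_heterozygous_sites(0.0007)
```

With the estimates quoted above, the reciprocal gives 500
nucleotides for human and about 1,430 for cattle, of the
same order as the rounded figure given in the text.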

One type of single base pair polymorphism deserves
special attention: the CpG to TpG transition. The
cytosine in the CpG dinucleotide sequence is known to be
the substrate of a eukaryotic methylase, which will add
a methyl group at position 5 of the pyrimidine ring if
the cytosine of the complementary CpG dinucleotide is
itself methylated. Deamination of a 5-methylcytosine
generates a thymine, blurring the task of the DNA repair
machinery, which will half of the time resolve the
ensuing mismatch by replacing the original guanine
instead of the mutated thymine. As a consequence,
cytosines in the CpG doublet exhibit mutation rates at
least ten times higher than other nucleotides, and hence
are rich sources of single base pair polymorphisms (4, 5).

2. DNA Sequence Rearrangements
In this kind of DSP, the difference between allelic
variants involves DNA sequence rearrangements such as
the insertion or deletion of a stretch of DNA, DNA
sequence inversions and duplications.

Although a wide spectrum of molecular mechanisms can
generate such chromosomal rearrangements, it is well
established that mobile genetic elements contribute
significantly to this kind of DSP.

In lower eukaryotes such as Drosophila and yeast,
rearrangements involving transposable elements account
for a large proportion of new mutations detected in
these organisms (6). In the mouse, retrovirus-like
sequences or retrotransposons have been shown to act as
insertional mutagens (7-11), and different strains of
mice exhibit substantial heterogeneity with respect to
the numbers and chromosomal sites of endogenous
proviruses (12). Variation in the distribution of
endogenous retroviruses has been demonstrated in poultry
as well.

In the human, at least 10% of the genome is known
to be composed of retroposon-like sequences. Evidence
for a role of these sequences in human genetic
variability and disease stems from several reports of
de novo mutations due to these sequences: a mutation in
the human Low Density Lipoprotein receptor gene giving
rise to familial hypercholesterolemia is caused by a
deletion brought about by an intrastrand recombination
event between two Alu sequences (13); L1 insertions were
found to inactivate the factor VIII gene in hemophilia A
patients (14); a c-myc rearrangement in a breast
carcinoma was found to be due to insertion of an L1
element (15); an Alu transposition event has been
documented in human lung carcinoma cells (16); and a
homologous recombination between the LTRs of a human
retrovirus-like element was shown to cause a 5 kb
deletion polymorphism. Recently, Wong et al. (17)
reported evidence of human DNA polymorphism arising
through DNA-mediated, rather than RNA-mediated, transfer
between autosomes as well.



3. Expansion-Contraction Type Polymorphism
A significant proportion of the eukaryotic genome
is composed of sequences widely termed "satellite DNA,"
sharing a common organization: a sequence motif, varying
in length between one and several thousand nucleotides,
repeated in a head-to-tail or so-called tandem
arrangement. Depending on the methodology originally used
for their study, i.e. isopycnic centrifugation, pulsed
field gel electrophoresis, agarose gel electrophoresis or
polyacrylamide gel electrophoresis, satellite sequences
were grouped into four size classes: macro-, midi-,
mini- and micro-satellites. Minisatellites are also
known as Variable Number of Tandem Repeats (VNTRs)
(18-21). While macro-satellites seem to be confined to
heterochromatic regions (22), mini- and micro-satellites
have been found scattered throughout the genome with,
however, clustering of mini-satellites (23-34). In the
human, minisatellite clusters seem to be particularly
abundant in preterminal regions (35). The only
midi-satellite described as such to date has been mapped
to the short arm of chromosome 1 (36). In the human, the
polydeoxyadenylate tracts of Alu repetitive elements are
also characterized by length variation and are thus an
abundant source of genetic markers as well (37).

The function, if any, of satellite sequences,
whether macro-, midi-, mini- or micro-, is essentially
unknown. An important feature of all satellite sequences
is that the maintenance of their tandemly repeated
organization is dependent on the concerted evolution of
the repeats. This concerted evolution is thought to
result from successive rounds of unequal crossing-over
(or any other mechanisms fitting the "card deck" model
(38)), which are favored by the tandemly repeated
structure itself. The proposed unequal crossing-over
mechanism, whether happening between sister chromatids
or homologous chromosomes, explains the substantial
degree of length polymorphism, here referred to as
"expansion-contraction polymorphism", characterizing
those sequences. Moreover, the ensuing shuffling of
slightly divergent repeat units or Minisatellite Variant
Repeats (39) within the satellite generates additional
internal site polymorphism. These peculiar properties
of satellite sequences have made them an invaluable
source of highly informative genetic markers, both in
the human and in domestic species (reviewed in 31).

B. Detection of DNA Sequence Polymorphism 

During the last ten years, a multitude of methods
have been developed for the detection of DSP. Two
techniques, however, undoubtedly dominate this field:
Southern blot hybridization (40) and the Polymerase
Chain Reaction or PCR (41), used either separately or in
conjunction. A non-exhaustive list is reported here,
the methods being grouped into four classes.

1. Restriction Pattern Analysis
DSP may alter the restriction patterns of defined
chromosomal regions, generating so-called Restriction
Fragment Length Polymorphisms (RFLP). Depending on the
size range of the explored restriction fragments, one
will use either agarose gel electrophoresis, pulsed-field
gel electrophoresis (30) or polyacrylamide gel
electrophoresis (42) for intermediate, large or small
fragments respectively. RFLPs are classically detected
by Southern blot hybridization. Alternatively, one can
analyze restriction patterns of defined DNA sequences
amplified by PCR, generating so-called Amplified
Sequence Polymorphisms (43). When studying chromosomal
rearrangements or expansion-contraction type
polymorphisms, the use of PCR obviates the need for
restriction enzyme digestion, the DSP reflecting itself
in the size of the amplified product.
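
How a single base pair change translates into a different
restriction pattern can be sketched as follows. The two
toy alleles and the function name are invented for
illustration only; the TaqI recognition sequence (TCGA,
cut after the first T) is the one property taken from the
enzyme itself. The sketch assumes a linear fragment,
complete digestion and a single strand read 5' to 3'.

```python
from typing import List


def restriction_fragments(sequence: str, site: str, cut_offset: int) -> List[int]:
    """Fragment lengths after complete digestion of a linear sequence.

    `site` is the recognition sequence and `cut_offset` the position of
    the cut within it (1 for TaqI, which cuts T^CGA).
    """
    sequence, site = sequence.upper(), site.upper()
    cut_positions = []
    start = sequence.find(site)
    while start != -1:
        cut_positions.append(start + cut_offset)
        start = sequence.find(site, start + 1)
    boundaries = [0] + cut_positions + [len(sequence)]
    return [right - left for left, right in zip(boundaries, boundaries[1:])]


# Hypothetical 20 bp alleles differing by a CpG to TpG transition
# inside the TaqI site.
allele_with_site = "A" * 10 + "TCGA" + "G" * 6
allele_without_site = "A" * 10 + "TTGA" + "G" * 6

fragments_a = restriction_fragments(allele_with_site, "TCGA", 1)
fragments_b = restriction_fragments(allele_without_site, "TCGA", 1)
```

In this toy case the first allele is cut into two
fragments while the second, having lost the site, stays
intact, which is the size difference an RFLP assay
detects.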

Because of its simplicity, the detection of RFLPs
has by far been the most popular approach towards DSP.
The relative lack of power inherent to the method (only
20% of a given sequence is amenable to exploration using
the most common restriction enzymes) can be compensated
for by focusing on highly polymorphic sequences such as
CpG dinucleotides (using enzymes such as TaqI and MspI
containing CpG in their recognition sequence) or
hypervariable minisatellites. The discovery, however, of
microsatellites as a very abundant source of highly
informative DSP in a broad taxonomic range, easily
detectable by PCR, is likely to shift the focus towards
these sequences for future marker development (32-34,
37, 44).

2. Mismatch Analysis
Several methods for the detection of DSP are based
on mismatch analysis. The DNA to be analyzed is
probed with a sequence corresponding to a defined
genetic variant. The presence of a different variant in
the target DNA generates a mismatched heteroduplex,
which can be detected by various means:

a. Detection of Altered Melting Behavior
A mismatched heteroduplex will differentiate itself
from the perfectly matched homoduplex by an altered
melting behavior, which can be detected either as an
all-or-none, binary response (positive for the
homoduplex, negative for the heteroduplex) or as a more
graded response that allows different heteroduplex
variants to be distinguished.

The classical all-or-none test depends on the use
of allele-specific oligonucleotides in hybridization
experiments. By choosing appropriate hybridization and
washing conditions, the allele-specific oligonucleotide
will only recognize a perfectly complementary sequence
(45). With the advent of PCR, new variants of this
approach have been described, including reverse dot-blot
(46), the Amplification Refractory Mutation System (47)
or allele-specific polymerase chain reaction (48), and
Competitive Oligonucleotide Priming (49). The Ligation
Amplification Reaction, the amplification of specific DNA
sequences using sequential rounds of template-dependent
ligation, can also be considered as a peculiar
application of the allele-specific oligonucleotide
approach (50).

More discriminating is Denaturing Gradient Gel
Electrophoresis, exploring the pattern of melting
behavior characterizing each heteroduplex when
electrophoresed through an increasing gradient of DNA
denaturants (51). The sensitivity of this method can
also be improved by pre-amplifying the target sequence
by PCR.



b. Ribonuclease and Chemical Mismatch Detection
The presence of a mismatch in a heteroduplex makes
those molecules susceptible to cleavage by various means,
including chemical treatment with either hydroxylamine
or osmium tetroxide (52), as well as ribonucleases such
as RNase A in the case of an RNA:DNA heteroduplex (51).
Electrophoretic analysis of the cleavage products allows
one to distinguish different genetic variants. Again,
implementing PCR will increase the sensitivity of the
approach.

3. Single Stranded Conformation Polymorphism
Under nondenaturing conditions, single-stranded DNA
has a folded conformation that is stabilized by
intrastrand interactions. Consequently, the conformation,
and therefore the electrophoretic mobility, is dependent
on the sequence. DNA variants do indeed exhibit mobility
shifts when electrophoresed in such conditions,
presumably resulting from conformational changes caused
by sequence alterations, hence the name single stranded
conformation polymorphism. Again, the altered mobility
can be detected by blot hybridization analysis or by
relying on PCR (53, 54).

4. Direct Determination of the DNA Sequence
Obviously the most powerful approach towards DSP is
the direct determination of the DNA sequence. The need
for a cloning step, however, in classical sequencing
protocols precluded the analysis of large samples. This
limitation has been circumvented by the development of
genomic sequencing (55), allowing the direct
determination of defined DNA sequences from genomic DNA,
and more recently and less laboriously by the development
of direct sequence determination of PCR amplified
products. The feasibility of the latter approach for the
detection of DSP has been amply demonstrated in several
independent studies (see, for instance, 56).

C. Origin and Evolution of DNA Sequence Polymorphism 
DSP encountered in a given population find their
origin in mutational events occurring in the germline
and escaping the DNA repair machinery. The fate of
these germline mutations in the population is dominated
by two kinds of effects: stochastic and deterministic
effects.

1. Stochastic Effects
When a new mutation appears in the population, its
initial survival depends largely on chance, regardless
of its selective effect. This is easily illustrated as
follows. Assume an individual heterozygous for a
neomutation inherited from its parent, in whose germline
the mutation appeared. If this individual has one, two
or three offspring, the chances for the neomutation to
be lost from the population, because it is transmitted
to none of the offspring, are 0.5, 0.25 and 0.125
respectively. Even if inherited by part of the offspring,
the same "stochastic filter" will operate in the next
generation. In the course of this random drift, the
overwhelming majority of mutant alleles are lost by
chance. However, some will see their frequency increase
in the population, and despite fluctuations over time,
eventually become fixed in the population, until
substituted by the next mutant allele.
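
The loss probabilities quoted above follow from each
offspring independently failing to inherit the mutant
allele with probability one half. A minimal sketch, with
a hypothetical function name and ignoring any selective
effect:

```python
def probability_of_immediate_loss(n_offspring: int) -> float:
    """Chance that a heterozygous parent passes a new mutation to none
    of its offspring, assuming independent Mendelian transmission."""
    if n_offspring < 0:
        raise ValueError("number of offspring cannot be negative")
    return 0.5 ** n_offspring


loss_probabilities = [probability_of_immediate_loss(n) for n in (1, 2, 3)]
```

For one, two and three offspring this reproduces the
values 0.5, 0.25 and 0.125 given in the text.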

As demonstrated essentially by Kimura (57) in the
framework of his neutral theory of molecular evolution,
the probability for a selectively neutral neomutation to
be fixed in a population of N individuals is equal to
its initial frequency $1/(2N)$, the average time for
fixation is four times the "effective" population size,
or $4 N_e$ generations, and the rate $k$ of mutant
substitution per generation is simply equal to the rate
of mutation per gamete and per generation, $\mu$,
independent of what the population size may be.
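
The independence of the substitution rate from population
size can be checked numerically: 2N gametes each mutate at
rate μ per generation, and each new neutral mutation is
fixed with probability 1/(2N), so the two factors of 2N
cancel. The sketch below is illustrative only and the
function name is hypothetical.

```python
def neutral_substitution_rate(population_size: int, mutation_rate: float) -> float:
    """Rate k of neutral mutant substitution per generation.

    New mutations entering a diploid population each generation (2N * mu)
    multiplied by the fixation probability of each one (1 / 2N).
    """
    new_mutations_per_generation = 2 * population_size * mutation_rate
    fixation_probability = 1.0 / (2 * population_size)
    return new_mutations_per_generation * fixation_probability


# The same mutation rate in a small and a large population.
k_small = neutral_substitution_rate(100, 1e-6)
k_large = neutral_substitution_rate(1_000_000, 1e-6)
```

Both calls return a rate equal to the mutation rate, up
to floating point rounding.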

According to this view, a polymorphism observed in
a population at a given time is composed of "transient"
alleles caught in their stochastic "odyssey" throughout
the population.

Populations for which $4 N_e \mu < 1$ are essentially
monomorphic, while populations for which $4 N_e \mu > 1$
are characterized by a substantial degree of transient
polymorphism. The model predicts a steady state level
of heterozygosity, H:

$$H = \frac{4 N_e \mu}{4 N_e \mu + 1}$$
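
The steady state heterozygosity can be evaluated directly
from the effective population size and the mutation rate.
The numerical values used below are arbitrary examples,
not estimates from the source, and the function name is
hypothetical.

```python
def expected_heterozygosity(effective_size: float, mutation_rate: float) -> float:
    """Steady state heterozygosity H = 4*Ne*mu / (4*Ne*mu + 1)."""
    scaled_rate = 4.0 * effective_size * mutation_rate
    return scaled_rate / (scaled_rate + 1.0)


# Arbitrary illustrative values of Ne and mu.
h_low = expected_heterozygosity(100, 1e-6)     # 4*Ne*mu well below 1
h_mid = expected_heterozygosity(2500, 1e-4)    # 4*Ne*mu equal to 1
h_high = expected_heterozygosity(1e6, 1e-4)    # 4*Ne*mu well above 1
```

H stays close to zero when the product 4Neμ is small,
equals one half when the product is exactly 1, and
approaches 1 as the product grows.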

2. Deterministic Effects 

There is evidence that the fate of a significant
proportion of DSP, especially those occurring in
non-coding parts of the genome (composing the large
majority of the genome), is essentially dominated by
random drift. However, when a neomutation affects a DNA
sequence which is expressed at the phenotypic level in
the broad sense, the mutation may no longer be
selectively neutral, and deterministic effects will be
superimposed on the stochastic ones. Negative and
positive selection will respectively decrease or
increase the probability and rate of fixation, while
"balancing selection" will maintain specific alleles in
a population in an equilibrium state.

a. Negative Selection 
When comparing DNA sequences between taxa, it
appears that the estimated number of mutant
substitutions per nucleotide needed to account for the
observed divergence is highest for non-coding sequences,
such as pseudogenes and intronic sequences, and much
lower for coding sequences. For the latter, however, a
distinction must be made between first, second and third
positions of the codons. The third position, for which
only 28% of substitutions are expected to cause an amino
acid change (versus 95% and 100% for first and second
positions respectively), exhibits the highest
substitution rate. When estimating that part of
substitutions at the third positions which are so-called
synonymous, rates very similar to non-coding regions are
observed (57). Moreover, DSP are more prominent for
non-coding sequences and, within coding sequences, at
third codon positions (compared to first and second
positions (58)). These observations are easily explained
by assuming that the fate of neomutations arising in
non-coding regions, or of synonymous neomutations, is
dominated by stochastic effects, while the fate of
mutations causing amino acid replacements will depend as
well on whether or not they disrupt the function of the
protein, in which case they will be eliminated from the
population by negative, "purifying" selection. The
higher the functional constraints imposed on a protein,
the higher the proportion of neomutations expected to be
harmful and, hence, the lower the substitution rate,
expressed at the protein level as a higher "unit
evolutionary time" (average time required for one amino
acid change to appear in a sequence of 100 amino acid
residues).

These observations have been considered as a strong
argument in favor of a predominant role for random drift
in the dynamics of molecular evolution.

b. Positive Selection
According to the previous discussion, the major
drive behind molecular evolution is non-adaptive in
nature, which is in conflict with the classical theory
of adaptive, positive Darwinian selection.

There is, however, evidence for positive and
adaptive evolution at the molecular level in at least a
few instances. Comparison of DNA sequences from members
of two gene families, the serine protease inhibitors in
rat (59) and the pregnancy-specific β1 glycoprotein gene
family in man, has provided evidence for higher
substitution rates at first and second codon positions
than at the third position, in at least some protein
regions, pointing towards positive selection.

Moreover, there are a number of experimental data
suggesting that some allelic differences identified by
electrophoresis are associated with adaptation to
different environments. In Drosophila, for instance,
there is evidence for a correlation between the in vitro
heat resistance of ADH variants and the temperature
characterizing their geographical origin (58).

c. Balancing Selection
The evolutionary forces described so far generate
transient DSP in the sense that the population
frequencies of existing genetic variants will
irrevocably change with time until either fixation or
loss. In some cases, however, alleles may be maintained
in a population at a steady state level. Overdominance
is one of the mechanisms able to generate such a
"balanced polymorphism". For a two allele system, this
means that the heterozygous individuals benefit from a
selective advantage compared to both homozygous
genotypes. This is expected to generate a steady state
where both alleles are maintained in the population at
respective equilibrium frequencies p and q, where

$$q = \frac{s}{s + t} \quad \text{and} \quad p = \frac{t}{s + t},$$

s and t being the respective selection coefficients of
the homozygotes.
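
The equilibrium can be verified numerically. The sketch
below assumes the usual fitness parameterization, which
the text does not spell out: relative fitness 1 - s for
the homozygote of the allele with frequency p, 1 for the
heterozygote, and 1 - t for the other homozygote. The
function names and the values of s and t are
illustrative.

```python
from typing import Tuple


def overdominance_equilibrium(s: float, t: float) -> Tuple[float, float]:
    """Equilibrium allele frequencies (p, q) under overdominance."""
    if s <= 0 or t <= 0:
        raise ValueError("both homozygotes must be at a disadvantage")
    return t / (s + t), s / (s + t)


def next_generation_frequency(p: float, s: float, t: float) -> float:
    """Frequency of the first allele after one round of selection.

    Assumed fitnesses: 1 - s, 1 and 1 - t for the three genotypes.
    """
    q = 1.0 - p
    mean_fitness = p * p * (1 - s) + 2 * p * q + q * q * (1 - t)
    return (p * p * (1 - s) + p * q) / mean_fitness


p_eq, q_eq = overdominance_equilibrium(0.1, 0.3)
p_after = next_generation_frequency(p_eq, 0.1, 0.3)
p_from_below = next_generation_frequency(0.5, 0.1, 0.3)
```

At the equilibrium frequency one generation of selection
leaves p unchanged, whereas a population started below
the equilibrium moves towards it.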

The best known example of balanced polymorphism due
to overdominance is the maintenance of the β-globin S
allele (causing sickle-cell anemia in the homozygotes)
as well as thalassemia-causing mutants (see, for
instance, 61) in populations subjected to malaria,
because of the resistance exhibited by the heterozygotes
towards the parasite. The high level of polymorphism
observed at the Major Histocompatibility Locus is
thought to result from overdominance selection as well
(62).

Frequency dependent selection may be another cause
of balanced polymorphism, an example being the "rare
mate advantage" observed in Drosophila (63).

II. CONSTRUCTION OF PRIMARY DNA MARKER MAPS 

A. Linkage Strategies 

Two loci are said to be genetically linked if,
during meiosis, they recombine at a rate significantly
lower than 50%, i.e., they generate significantly more
parental gametes than recombinant gametes. The
recombination rate between loci reflects the frequency
of occurrence of an odd number of crossing-overs between
the loci. Because the probability of crossing-over is
proportional to the distance separating the loci, the
recombination rate can be used as a unit of chromosomal
length. This length unit is known as the Morgan (M),
1 cM corresponding to the distance separating two loci
exhibiting a 1% recombination rate. For small distances
(<30 cM), the relation between centimorgan and
recombination rate is essentially linear. For longer
distances, however, the relation is more complex,
depending on the frequency of double crossing-overs,
itself affected by possible interference.
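
One common way to express the relation between map
distance and recombination rate is a map function. The
sketch below uses Haldane's map function, which assumes no
interference; it is a swapped-in illustration, not a
method named in the text, and other map functions give
different values at long distances.

```python
import math


def haldane_recombination_rate(distance_cm: float) -> float:
    """Recombination rate for a map distance in centimorgans.

    Haldane's map function, r = 0.5 * (1 - exp(-2d)) with d in Morgans,
    valid under the assumption of no crossover interference.
    """
    if distance_cm < 0:
        raise ValueError("map distance cannot be negative")
    distance_morgans = distance_cm / 100.0
    return 0.5 * (1.0 - math.exp(-2.0 * distance_morgans))


rate_at_1_cm = haldane_recombination_rate(1.0)
rate_at_30_cm = haldane_recombination_rate(30.0)
rate_at_300_cm = haldane_recombination_rate(300.0)
```

Under this assumption 1 cM corresponds to a recombination
rate very close to 1%, while the rate falls progressively
short of the linear value at longer distances and levels
off below 50%.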

Parental and recombinant gametes will only be
distinguishable for doubly heterozygous individuals,
hence the need for highly polymorphic markers.

Recently, owing to the advent of PCR, it has
become possible to directly determine the genotype of
individual gametes (64). However, most of the time, the
gametic contribution is inferred from the genotype of
the offspring and linkage studies are performed within
families. Most modern linkage studies use the lodscore
test for evaluation of linkage: a sequential test based
on the method of maximum likelihood (65). The lodscore
corresponds to $\log_{10}(LR)$, where LR is the ratio of
the likelihood of the observations under the alternative
hypothesis of linkage, $\theta < 0.5$, to the likelihood
of the observations under the null hypothesis of no
linkage, $\theta = 0.5$. In human genetics, a lodscore
> 3 is accepted as significant evidence for linkage. The
prior probability of linkage between two loci has been
used to justify this stringent critical value. Note that
$2 \ln(LR)$ can be used as well, having a chi-square
distribution with one degree of freedom under the null
hypothesis of no linkage.
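
For the simplest situation, in which every meiosis can be
scored unambiguously as recombinant or non-recombinant
(phase-known, fully informative meioses), the lodscore
reduces to a binomial likelihood ratio. The sketch below
covers only that special case; the function names and the
counts are illustrative assumptions.

```python
import math


def lod_score(recombinants: int, meioses: int, theta: float) -> float:
    """log10 likelihood ratio of linkage at `theta` versus theta = 0.5.

    Assumes phase-known, fully informative meioses, so the likelihood is
    theta**k * (1 - theta)**(n - k) for k recombinants out of n.
    """
    if not 0.0 <= theta <= 0.5:
        raise ValueError("theta must lie between 0 and 0.5")
    non_recombinants = meioses - recombinants

    def log10_likelihood(rate: float) -> float:
        total = 0.0
        if recombinants:
            if rate == 0.0:
                return float("-inf")  # recombinants are impossible at theta = 0
            total += recombinants * math.log10(rate)
        if non_recombinants:
            total += non_recombinants * math.log10(1.0 - rate)
        return total

    return log10_likelihood(theta) - log10_likelihood(0.5)


def chi_square_from_lod(lod: float) -> float:
    """Convert a lodscore to the equivalent 2*ln(LR) statistic."""
    return 2.0 * math.log(10.0) * lod


example_lod = lod_score(recombinants=2, meioses=20, theta=0.1)
```

With 2 recombinants among 20 meioses, the lodscore
evaluated at a recombination rate of 0.1 is a little
above 3, and a lodscore of 3 corresponds to a 2 ln(LR)
value of about 13.8.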

Recently, algorithms for multilocus linkage
analysis have been developed, allowing an estimate of
the most likely gene orders and genetic distances
between several loci simultaneously (66-68).

Although usually determined within families,
genetic linkage can manifest itself at the population
level also: a phenomenon called "linkage
disequilibrium". According to the Hardy-Weinberg law,
the equilibrium genotypic frequencies are reached in a
single generation (except if the initial gene
frequencies are not equal among sexes). For a diallelic
system with alleles a1 and a2, with respective allelic
frequencies $p_1$ and $p_2$, the equilibrium genotypic
frequencies are $p_1^2$, $2 p_1 p_2$ and $p_2^2$ for
a1a1, a1a2 and a2a2 respectively. This does not
necessarily hold when considering two loci
simultaneously. The genotypic equilibrium frequencies
are only reached when the previous generation produces
the four possible gametes at the expected frequencies:
a1b1: $p_1 q_1$, a1b2: $p_1 q_2$, a2b1: $p_2 q_1$,
a2b2: $p_2 q_2$, where $q_1$ and $q_2$ are the
frequencies of alleles b1 and b2 at the second locus.
The difference between observed and expected gametic
frequencies is called linkage disequilibrium, D. The
value of D is reduced by $D \cdot \theta$ every
generation, $\theta$ being the recombination rate
between the two loci. For unlinked loci D diminishes by
1/2 every generation; for linked loci, however, the
reduction of D per generation will be much smaller. The
detection of a linkage disequilibrium is an indication
of linkage between the corresponding loci.
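
The decay of linkage disequilibrium described above can be
sketched as a geometric series: losing a fraction θ of D
each generation leaves D multiplied by (1 - θ). The
frequencies used below are arbitrary examples, the
function names are hypothetical, and random mating with no
drift or selection is assumed.

```python
def linkage_disequilibrium(freq_a1b1: float, p1: float, q1: float) -> float:
    """D as observed minus expected frequency of the a1b1 gamete."""
    return freq_a1b1 - p1 * q1


def disequilibrium_after(d_initial: float, theta: float, generations: int) -> float:
    """D after a number of generations of random mating.

    Each generation removes a fraction theta of the current D.
    """
    return d_initial * (1.0 - theta) ** generations


d0 = linkage_disequilibrium(freq_a1b1=0.4, p1=0.5, q1=0.5)
d_unlinked = disequilibrium_after(d0, theta=0.5, generations=1)
d_linked = disequilibrium_after(d0, theta=0.01, generations=10)
```

In this example D is halved after a single generation for
unlinked loci, whereas for loci 1% apart most of the
initial disequilibrium is still present after ten
generations.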

B. Genetic Maps 

Using this linkage approach, combined with
alternative mapping strategies such as "in situ"
hybridization (see, for instance, 69), the use of
somatic cell hybrid panels and radiation hybrid mapping
(reviewed in 70) and comparative mapping (71), the map
location of a large set of DSP can be determined in
order to build a genetic marker map (see, for instance,
72-74). Assuming a total map length of 30 M as for the
human, and a desirable maximum distance of 20 cM between
markers, a set of 150 DSP could cover the entire genome.
However, many more markers will be needed to generate
reasonable maps for our domestic species, essentially
for two reasons. First, most of the time we have no a
priori information on the location of the characterized
markers. Hence, some chromosomal regions will initially
be overrepresented in our map, others underrepresented.
This problem is expected to become critical in the later
stages of the development of a map. Comparative data
will then become critical, making it possible to search
for markers whose location can be predicted from other
species. Second, an individual will only be informative
for the markers for which it is heterozygous; parts of
its genome will thus not be explorable, because it will
be homozygous for the corresponding markers. To
compensate for this, one will have to identify more
markers, the number required being inversely
proportional to their heterozygosity; hence the
importance of highly informative systems.
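
The marker count quoted above is the map length divided by
the desired spacing, and the inverse proportionality to
heterozygosity can be layered on top of it. The sketch
below assumes perfectly even marker spacing, which the
text itself notes is unrealistic, so it gives a lower
bound only. The function names and the heterozygosity
value of 0.5 are illustrative.

```python
import math


def markers_for_even_coverage(map_length_cm: float, max_spacing_cm: float) -> int:
    """Minimum number of evenly spaced markers to cover a genetic map."""
    return math.ceil(map_length_cm / max_spacing_cm)


def markers_adjusted_for_heterozygosity(
    map_length_cm: float, max_spacing_cm: float, heterozygosity: float
) -> int:
    """Scale the marker count inversely with marker heterozygosity."""
    if not 0.0 < heterozygosity <= 1.0:
        raise ValueError("heterozygosity must be in (0, 1]")
    base = map_length_cm / max_spacing_cm
    return math.ceil(base / heterozygosity)


# 30 M = 3,000 cM with markers at most 20 cM apart.
base_count = markers_for_even_coverage(3000, 20)
adjusted_count = markers_adjusted_for_heterozygosity(3000, 20, 0.5)
```

The first call reproduces the figure of 150 markers; with
markers that are heterozygous in only half of the
individuals, the requirement doubles.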

Once such a map is available, however, any gene for
which the appropriate segregating family material is
available can be located on the map. Assuming a maximum
marker-target gene distance of 10 cM, the expected
lodscore for doubly informative, phase-known meioses
approximates 0.16 per meiosis (75). Therefore, 20 such
meioses are theoretically sufficient to establish
linkage with a lodscore of 3. In practice, however, the
number of individuals to analyze will be higher, a
function among other factors of the quality of the
marker, expressed as its Polymorphism Information
Content (76).
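
The figure of about 0.16 can be recovered from the
expected lodscore contributed by one phase-known, fully
informative meiosis when the lodscore is evaluated at the
true recombination rate. The sketch below treats 10 cM as
a recombination rate of 0.1, using the linear
approximation mentioned earlier for short distances; the
function names are hypothetical.

```python
import math


def expected_lod_per_meiosis(theta: float) -> float:
    """Expected lodscore from one phase-known, informative meiosis.

    E[lod] = log10(2) + theta*log10(theta) + (1 - theta)*log10(1 - theta),
    evaluated at the true recombination rate, for 0 < theta < 0.5.
    """
    if not 0.0 < theta < 0.5:
        raise ValueError("theta must lie strictly between 0 and 0.5")
    return (
        math.log10(2.0)
        + theta * math.log10(theta)
        + (1.0 - theta) * math.log10(1.0 - theta)
    )


def meioses_needed(target_lod: float, theta: float) -> int:
    """Smallest number of meioses whose expected lodscore reaches the target."""
    return math.ceil(target_lod / expected_lod_per_meiosis(theta))


lod_at_10_cm = expected_lod_per_meiosis(0.1)
required = meioses_needed(3.0, 0.1)
```

The expected contribution rounds to 0.16 per meiosis, and
on that basis 19 meioses reach an expected lodscore of 3,
in line with the round figure of 20 given in the text.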

The efficiency of this approach has been
illustrated by the recent mapping of a large number of
genes involved in human single gene disorders (see, for
instance, 77, 135). The identification of DNA markers
for a defined gene can be the first step towards its
molecular cloning. Successful "positional cloning", or
the isolation of a gene based on its map location, has
been achieved in the human for Chronic Granulomatous
Disease, Duchenne Muscular Dystrophy, Retinoblastoma,
Wilms' Tumor, Cystic Fibrosis (134), Type-1
Neurofibromatosis and the Testis Determining Factor.

In domestic animals, genetic maps could be used to
localize the genes underlying production traits,
allowing for Marker Assisted Selection and providing a
first step towards their isolation, the understanding of
their mechanism of action and their manipulation by
mutagenesis and gene transfer methods. Several
laboratories around the world are now involved in the
development of markers and the construction of genetic
maps for our main domestic species, especially cattle,
pigs and poultry.



III. GENETIC MAPPING OF QUANTITATIVE TRAIT LOCI 

The majority of traits dealt with in animal
production are so-called quantitative traits,
characterized by continuous variation. The phenotype of
an animal with respect to a particular trait is the
result of the effects of several "polygenes" known as
Quantitative Trait Loci, or QTL, combined with
environmental effects. The number of polygenes involved
is essentially unknown. Classically, it is considered
very large, each gene contributing a very small part of
the genetic variation. However, there is evidence both
from the plant world and the animal world that QTL with
significant effects are common (78, 79). The most likely
model is to assume that there is indeed a large number
of genes involved, but that there is a broad
distribution of effects, substantial in some cases.
Polygenes with extreme effects, whose segregation in a
population may cause skewness and bi- or trimodality,
are known as "major genes". Examples in animal breeding
are "double muscling" genes in both cattle and pigs, the
"White Shorthorn" gene involved in the determinism of
"White Heifer Disease" and the "Booroola" fertility gene
in sheep (80). Even with significant effects on the
trait of interest, however, their contribution to the
total genetic variation may be limited in case of low
population frequency.

When dealing with quantitative traits, direct
determination of genotype for the corresponding QTL is
impossible. Nevertheless, strategies have been designed
to map QTL by linkage analysis. Within segregating
populations, which is usually the case for our domestic
species, QTL mapping can be performed both within
families and at the population level.

A. QTL Mapping Within Families 

Traditionally one proceeds as follows: offspring
from an individual heterozygous for both marker and QTL
are grouped according to which allele at the marker
locus they inherited; a statistically significant
difference between the phenotypic means of the two
groups indicates linkage between marker and QTL.
Statistical significance is tested by linear regression
(i.e. one-way analysis of variance) under the
assumption of normally distributed residual environmental
variance. Classically, markers are tested one at
a time for possible linkage with a QTL affecting the
trait of interest. One of the drawbacks of this approach
is that it is impossible to unequivocally estimate both
the map location of the QTL with respect to the marker and
its effect on the considered trait; no distinction can
be made between a closely linked QTL with small effect
and a loosely linked QTL with major effect.
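
The single-marker test described above can be sketched as follows. This is a minimal illustration only: the function name and the phenotype values are hypothetical, and a pooled-variance t statistic is used, which for two groups is equivalent to a one-way analysis of variance (F equals t squared).

```python
import math
from statistics import mean


def marker_contrast_t(group_m1, group_m2):
    """Pooled-variance t statistic for the difference between the phenotypic
    means of offspring that inherited marker allele M1 versus allele M2.

    Returns (t, degrees_of_freedom). Hypothetical helper, not from the source.
    """
    n1, n2 = len(group_m1), len(group_m2)
    m1, m2 = mean(group_m1), mean(group_m2)
    # Within-group sums of squares estimate the residual variance.
    ss1 = sum((x - m1) ** 2 for x in group_m1)
    ss2 = sum((x - m2) ** 2 for x in group_m2)
    df = n1 + n2 - 2
    pooled_var = (ss1 + ss2) / df
    se = math.sqrt(pooled_var * (1.0 / n1 + 1.0 / n2))
    return (m1 - m2) / se, df


# Hypothetical phenotypes of offspring sorted by inherited paternal marker allele.
t_value, df = marker_contrast_t([1.0, 2.0, 3.0], [4.0, 5.0, 6.0])
print(f"t = {t_value:.3f} with {df} degrees of freedom")
```

A large absolute t value suggests linkage but, as noted above, cannot by itself separate a closely linked QTL of small effect from a loosely linked QTL of large effect.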

Recently, the lodscore method has been improved,
making it possible to deal with quantitative and other
complex traits and to fully exploit the power of the
nearly complete marker maps which have become available
for different organisms. This approach is known as
interval mapping. Not only does interval mapping solve
the problem of simultaneous estimation of location and
effect, but because of its increased power, it reduces
the number of individuals to be tested to detect linkage
with a QTL of given effect (81).

Assuming that the marker is the QTL, the number of
individuals to test in order to detect an effect of
given amplitude, $\delta$, can be estimated from:

$$n = \frac{4\,(t_0 + t_1)^2\, s^2}{\delta^2}$$

where $n$ gives the required sample size, $s^2$ is an estimate
of the residual variance, $t_0$ is the t value associated
with Type I error, and $t_1$ is the t value associated with
Type II error; $t_1$ equals tabulated t for probability
$2(1-P)$, where $P$ is the required probability of detecting
$\delta$ if such a difference exists (82).

For dairy production, for instance, if the linkage
analysis is performed using the "daughter yield
deviations" (DYD; $\sigma_{DYD} \approx 600$ lb) from paternal
half-sibs ("granddaughter design") (83), one would have to study
1,500, 378 and 168 individuals, respectively, to detect
QTL with differences of 200 lb, 400 lb and 600 lb
between alternate alleles. Assuming a phenotypic variance
of $(2500\ \text{lb})^2$, such effects correspond to 0.08, 0.16 and
0.24 standard deviations, respectively. These estimates
assume a Type I error of 5%, a Type II error of 10%, and
absence of recombination between marker and QTL.
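
The sample sizes quoted above can be approximated with a short calculation. The sketch below assumes the sample-size relation n = 4 (t0 + t1)^2 s^2 / delta^2, uses normal quantiles in place of tabulated t values, and takes a residual standard deviation of 600 lb. It further assumes that the contrast observed between the two groups of sons is half the difference between alternate alleles, since a DYD reflects half of a son's breeding value; that halving is our reading and is not stated in the text. Under these assumptions the results come out close to the 1,500, 378 and 168 individuals given above.

```python
from statistics import NormalDist


def required_sample_size(delta, s, alpha=0.05, power=0.90):
    """Sample size n = 4 (t0 + t1)^2 s^2 / delta^2.

    Normal quantiles stand in for the t values (an approximation that is
    reasonable for the large samples considered here).
    """
    z = NormalDist()
    t0 = z.inv_cdf(1.0 - alpha / 2.0)  # two-sided Type I error
    t1 = z.inv_cdf(power)              # Type II error = 1 - power
    return 4.0 * (t0 + t1) ** 2 * s ** 2 / delta ** 2


S_DYD = 600.0  # lb, residual standard deviation of daughter yield deviations

for allele_difference in (200.0, 400.0, 600.0):
    # Assumption: the DYD contrast is half the difference between alleles.
    delta = allele_difference / 2.0
    n = required_sample_size(delta, S_DYD)
    print(f"{allele_difference:.0f} lb between alleles: about {n:.0f} individuals")
```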

If the tested marker and the QTL recombine at a
rate $\theta$, the number of individuals to test increases by
a factor $1/(1-2\theta)^2$ for single marker analysis, and by a
factor $\approx (1-r)/(1-2\theta)^2$ in the case of interval mapping, $r$
corresponding to the recombination rate between the
flanking markers (81).
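
As an illustration of how quickly recombination inflates the required sample size, the two factors can be evaluated directly. The function names and the example recombination rates below are hypothetical; the expressions are the ones given in the preceding paragraph.

```python
def single_marker_inflation(theta):
    """Factor 1 / (1 - 2*theta)^2 for single marker analysis."""
    if not 0.0 <= theta < 0.5:
        raise ValueError("theta must lie in [0, 0.5)")
    return 1.0 / (1.0 - 2.0 * theta) ** 2


def interval_mapping_inflation(theta, r):
    """Approximate factor (1 - r) / (1 - 2*theta)^2 for interval mapping,
    where r is the recombination rate between the flanking markers."""
    if not 0.0 <= r <= 0.5:
        raise ValueError("r must lie in [0, 0.5]")
    return (1.0 - r) * single_marker_inflation(theta)


# Hypothetical example: QTL at theta = 0.10 from a marker, flanking markers
# recombining at r = 0.18.
print(single_marker_inflation(0.10))
print(interval_mapping_inflation(0.10, 0.18))
```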

In view of the costs and time involved in geno- 
typing, it is important to minimize the required sample 
size. This can be achieved in various ways: 

a. Identification of the Individuals Most Likely to
be Heterozygous, hence Informative, for the Studied QTLs

One way to achieve this is to cross strains that are highly
divergent for the trait of interest. In plant
breeding, where the use of exotic germplasm is common
practice, this is perfectly applicable. Markers identified
for interesting QTLs from the exotic strains can then be
used for the marker assisted introgression of these QTLs
into the commercial varieties.

In animal breeding, however, introgression programs
are very uncommon. With "Velogenetics" (described
further below), however, the use of exotic germplasm in
introgression programs may become more attractive for
animal breeders as well.

An alternative approach is to identify the individuals
whose offspring show a higher variance
for the trait of interest.

b. Selective Genotyping of the Extreme Progeny

As pointed out by Lander and Botstein (75), the
individuals whose genotype can be most clearly inferred
from their phenotype are the ones providing most of the
linkage information when studying complex traits. For
quantitative traits, these are the individuals whose
phenotypic value deviates most from the mean: the tails
of the distribution. Sample sizes could be reduced by
60% and even 80% by focusing on individuals deviating
one and two standard deviations, respectively, from the
mean. Paradoxically, selective genotyping may be limited
by the size of the studied population. Indeed, a larger
sample will be required in order to find enough individuals
one or two standard deviations from the mean.

c. Decreasing Environmental Variance via Progeny 
Testing 

Weller et al. (83) have tested the effect of
progeny testing to reduce the environmental variance by
comparing the power of "daughter" and "granddaughter"
designs for the detection of QTLs in dairy cattle. In
the "daughter" design, marker genotype and quantitative
trait values are assessed on daughters of heterozygous
sires, while in the "granddaughter" design, marker
genotypes are determined on sons of heterozygous sires,
their breeding values being determined by progeny
testing from the quantitative trait value measured on
their daughters. They demonstrate that for equal power
the "granddaughter" design requires half as many marker
assays as the "daughter" design.

d. Reducing genetic noise by searching for several 
unlinked QTL simultaneously, or "simultaneous" search 
(81) . 

e. Using DNA Pools 

Instead of genotyping all individuals separately,
one can analyze DNA pools from individuals sorted by
phenotype. Significant differences in allelic frequencies
between pools point towards possible genetic
linkage between the corresponding marker locus and a
gene or genes affecting the trait of interest. This
approach can be used both for "within family" studies
and for studies at the population level. The latter
approach, however, requires linkage disequilibrium
between QTL and marker locus. This method was first
described by Arnheim et al. (84) to study the role of
HLA class II loci in insulin-dependent diabetes
mellitus. It was recently adopted by Plotsky et al.
(85) to study the association between DNA fingerprint bands
and abdominal fat deposition in broilers.

f. Exploiting "Tagged QTLs"

The direct effect of selection for a production
trait will be to increase the frequency of the favorable
alleles at the segregating QTLs. However, this selection
pressure may indirectly affect loci in linkage
disequilibrium by so-called "hitch-hiking".

This is probably what happened to the genetic
defect causing progressive degenerative myeloencephalopathy,
or Weaver, in Brown Swiss, shown to be linked to
a major gene for milk production. Because of the
deleterious effect of the Weaver-causing gene, it is the
heterozygous "carrier" genotype which is selectively
most advantageous, generating a "balanced polymorphism",
the Weaver-causing allele being maintained in the
population at a relatively high frequency. This can be
exploited, however, to map the corresponding QTL by
going through the relatively easy exercise (compared to
QTL mapping) of finding a marker linked to this single
gene disorder. We have recently identified a marker
linked to Weaver and presumably to the associated QTL.

Besides Weaver, QTLs for a variety of polygenic
traits have been identified, both in plants and animals.
Using complete DSP maps in tomato, Paterson et al. (78)
identified at least six genes controlling fruit mass,
four controlling soluble solids, and five controlling
fruit pH, accounting for 58%, 44% and 48%, respectively,
of the phenotypic variance. Martin et al. (79), using
a similar approach, identified at least three tomato
genes controlling water use efficiency. In cattle,
Geldermann et al. (86) found significant effects on milk
yield (+ 200 kg) and fat content (+ 1%), especially for
the β-lactoglobulin locus. More recently, Cowan et al.
(87) demonstrated significant effects on milk production
traits using a prolactin DNA Sequence Polymorphism as
marker.

B. QTL Mapping Within Populations

WO 92/13102 PCT/US92/00340

One can expect to find an effect of marker alleles
linked to QTLs also outside of a family context, i.e.,
at the population level, if the two loci are in linkage
disequilibrium. As reported by Hanset (88), and assuming
a diallelic marker (alleles M1 and M2 with respective
frequencies $p_1$ and $p_2$) linked to a diallelic QTL, the
phenotypic difference between the respective homozygotes
at the marker locus is:

$$\delta = 2a \cdot \frac{D}{p_1\, p_2}$$

with $D$ measuring the linkage disequilibrium and $2a$
corresponding to the phenotypic difference between the two
homozygotes for the QTL.
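
To give a feel for the magnitudes involved, the relation between the marker contrast, the QTL effect 2a, the disequilibrium D and the marker allele frequencies can be evaluated numerically. This is a sketch only; the function name and the example values are hypothetical.

```python
def marker_homozygote_contrast(a, D, p1):
    """Expected phenotypic difference between the two marker homozygotes,
    delta = 2a * D / (p1 * p2), for a diallelic marker in linkage
    disequilibrium D with a diallelic QTL whose homozygotes differ by 2a."""
    if not 0.0 < p1 < 1.0:
        raise ValueError("p1 must lie strictly between 0 and 1")
    p2 = 1.0 - p1
    return 2.0 * a * D / (p1 * p2)


# Hypothetical example: 2a = 400 lb, D = 0.10, equally frequent marker alleles.
print(marker_homozygote_contrast(a=200.0, D=0.10, p1=0.5))
```

With no disequilibrium (D = 0) the contrast vanishes, which is why marker effects are only expected at the population level when the two loci are in linkage disequilibrium.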

Markers for which a priori evidence for linkage
disequilibrium is highest are the so-called "candidate
genes": genes expected from their physiological role to
be likely candidates for the QTL itself. DSPs at those
loci, even if selectively neutral by themselves, can be
expected to exhibit linkage disequilibrium with the
hypothetical functional mutations because of their very
tight linkage. As an example, the B allele of the
κ-casein gene has been shown in several studies to
increase protein yield in milk by about 3%, and possibly
to improve cheese yield independent of the effect on
protein yield (see, for instance, 89, 90).

IV. USE OF DNA MARKERS IN BREEDING PROGRAMS

In classical selection programs, breeding values
are estimated from individuals' own performances and
the performances of relatives (136). The expected genetic
progress is a function of the accuracy of selection,
i.e. the correlation between estimated and true breeding
values. All direct information on QTL can be used to
increase the accuracy of selection and, hence, genetic
response. Early on, Soller and Beckmann (91) proposed
to exploit marker information for the preselection of
young dairy sires before progeny testing. In cattle,
Marker Assisted Selection is already used for the sexing
of preimplantation embryos using Y-specific probes (see,
for instance, 92), and for genotyping at the κ-casein
(see, for instance, 93) and prolactin loci (87). In
pigs, Marker Assisted Selection is used to reduce the
frequency of the major gene causing Porcine Stress
Syndrome (PSS). Susceptibility to PSS correlates with
Halothane sensitivity or Malignant Hyperthermia. This
condition has been mapped to a linkage group on pig
chromosome 6, encompassing the following markers:
S(A-O)-GPI-Hal-H-A1BG-PGD (reviewed by 94). These markers
are used for Marker Assisted Selection against the
PSS condition. Recently, the ryanodine receptor gene
has been identified as a good candidate for the Malignant
Hyperthermia or Hal gene (95).

As shown by Smith and Simpson (96), the gain to be
made with Marker Assisted Selection increases with the
proportion of QTL identified and is highest for low
heritability traits. Unfortunately, the QTL determining
the latter traits are also the hardest ones to identify.
It should be noted that the increase in accuracy
depends on the accurate estimation of the QTL
effects. This may require larger samples than the ones
needed for the detection of linkage. Once a QTL has been
mapped by within-family linkage studies, it may be more
effective to identify supplementary flanking markers and to
accurately determine the effect of the generated haplotypes
at the population level. Selection can then focus
on the best haplotype instead of spending initial selection
efforts on intermediate ones.

The use of genetic markers in selection programs
may also reveal dominance deviation (particularly
overdominance) and interaction deviation at defined QTL,
variance components poorly dealt with in classical
breeding theory. Specific programs may be required to
fully exploit these QTL. In the case of overdominance,
for instance, two lines, each homozygous for a different
allele at each QTL, could be developed and crossed
to produce multiple heterozygotes.

There is widespread interest in resolving
quantitative traits into their Mendelian components by
mapping the underlying QTL. The implementation of marker
assisted selection into breeding schemes, however, has
not always been received with enthusiasm. Part of the
skepticism expresses the doubt that the genetic gains
obtainable by marker assisted selection will justify
expensive and tedious large scale genotyping. Although
the costs of genotyping will drop substantially in the
near future, due to the rapid pace at which automation
and robotics are being applied to DNA technology, this
objection remains very pertinent.

Another major limitation of marker assisted
selection in its present form is that it is restricted to
the exploitation of genetic variation preexisting within
the commercial breed of interest, and only that present
in a "high merit" genetic background. Favorable
mutations appearing within a mediocre background, or
present in "exotic" germplasm, would be difficult to
exploit, even with markers.

We have therefore proposed a scheme, designated
"Velogenetics", combining marker assisted introgression
and germ-line manipulations to reduce the generation
interval, which might drastically increase the power of
marker assisted selection (141).

V. INDIVIDUAL IDENTIFICATION AND PATERNITY DIAGNOSIS:

Methods to estimate the breeding value of an animal
use information from relatives. As a matter of fact,
keeping track of familial relationships has always been
one of the major concerns of animal breeders, and
parentage control is now a widely used procedure for
several domestic species. Parentage control relies on
the use of polymorphic systems within the studied
population. The alleles that characterize an individual
originate from the mother or the father. If one of the
parents is known (usually the mother), the alleles
necessarily transmitted by the other parent can be
deduced easily. Paternity testing consists of scoring
the presence or absence of those obligate paternal alleles
in the genotype of the putative parent. Lack of one or
more of these alleles points towards incorrectly
assigned paternity. If, on the contrary, all obligate
paternal alleles are present in the tested parent, there
is no evidence for incorrectly assigned paternity.
Nevertheless, one always has to consider the possibility
of fortuitous coincidence. The higher the variation of
the genetic markers used, the higher the probability of
detecting incorrectly assigned paternity, and thus the higher
the "exclusion power".
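
The exclusion logic described above can be sketched for codominant markers as follows. All function names and genotypes are hypothetical, genotypes are written as pairs of allele labels, and the sketch ignores mutation, null alleles and typing errors, which a practical test would have to consider.

```python
def obligate_paternal_alleles(offspring, dam):
    """Alleles the sire may have transmitted at one locus, given the
    genotypes of the offspring and of the known dam."""
    a, b = offspring
    candidates = set()
    if a in dam:          # a came from the dam, so b is paternal
        candidates.add(b)
    if b in dam:          # b came from the dam, so a is paternal
        candidates.add(a)
    if not candidates:
        raise ValueError("offspring shares no allele with the recorded dam")
    return candidates


def sire_excluded_at_locus(offspring, dam, sire):
    """True when the putative sire carries none of the possible paternal alleles."""
    return obligate_paternal_alleles(offspring, dam).isdisjoint(sire)


def paternity_excluded(trios_by_locus):
    """An exclusion at any single locus is enough to reject the putative sire."""
    return any(sire_excluded_at_locus(o, d, s) for o, d, s in trios_by_locus.values())


# Hypothetical trio typed at two loci: (offspring, dam, putative sire).
trios = {
    "locus_A": ((1, 3), (1, 2), (3, 4)),  # paternal allele 3 is present
    "locus_B": ((2, 5), (2, 2), (1, 4)),  # paternal allele 5 is absent
}
print(paternity_excluded(trios))
```

Absence of exclusion at every locus does not prove paternity; it only means that no evidence against it was found with the markers used.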

Until now, the systems most often used for 
paternity testing were blood group systems, biochemical 
polymorphisms, or the major histocompatibility system. 
The availability of DSP, however, opens new perspectives 
for paternity diagnosis. Hypervariable minisatellites 
in particular, characterized by their remarkably high 
degree of polymorphism, have proven especially useful in 
this respect. Multilocus DNA fingerprints, based on the 
simultaneous detection of related minisatellite loci, 
have been shown to be extremely powerful for paternity 
diagnosis, both in human (19) and animals (108, 109). 
Exclusion powers as high as 99.999996% have been 
obtained with as few as 2 probes in the human (19). 
With such high exclusion powers, absence of exclusion 
can be considered proof for true biological parentage. 
Another corollary is that very high exclusion powers can 
be obtained even when a single parent is available and 
tested for parenthood. Multilocus DNA fingerprints,
however, tend to be replaced by the combined use of a 
limited number of locus-specific VNTR markers (20) , 
giving equally powerful, but more reproducible, 
sensitive and easily interpretable patterns. With the 
advent of locus-specific VNTRs and PCR-amplifiable
microsatellites in animal species (44) , the same will 
probably hold in this field too.

Along the same lines, DNA markers can be used for
individual identification as well. Using expansion-contraction
type polymorphisms, individual specific "DNA
bar codes" can literally be generated (19, 110).

SUMMARY OF THE INVENTION

Disclosed herein is a set of locus-specific genetic 
markers for domestic cattle and related bovids, that 
constitute a primary bovine DNA marker map. Among other 
applications, these markers and the map are useful for: 

- individual identification,

- parentage testing,

- the genetic mapping of economic trait loci, or
genes involved in the determinism of economically
important traits, whether single gene traits or
complex multifactorial traits,

- marker assisted selection,

- velogenetics, or the synergistic use of marker
assisted introgression and germ-line manipulations
to reduce the generation interval.

The usefulness of this set of markers for the
genetic mapping of economic trait loci is illustrated by
the identification of a genetic marker for bovine
progressive degenerative myeloencephalopathy or
"Weaver" in the Brown Swiss breed.

BRIEF DESCRIPTION OF THE FIGURES

Figure 1: Typical VNTR pattern obtained
with probe GMBT-005, using HaeIII.

Figure 2: Example of a microsatellite pattern
(TGLA9).

Figure 3: Schematic representation of
"Velogenetics".

DETAILED DESCRIPTION OF THE INVENTION

I. CONSTRUCTION OF A PRIMARY BOVINE DNA MARKER MAP: 

Our laboratory has focused over the last two years on the
development of a primary DNA marker map for cattle. We
have now developed more than 300 highly polymorphic DNA
markers of three types:

1. Variable Number of Tandem Repeat Markers (VNTR)

Hypervariable minisatellites are known to show
significant cross-hybridization between species (31, 44,
110). We have exploited this to isolate bovine VNTRs
using heterologous minisatellite probes. Screening
purpose-built libraries with minisatellite probes, we
have isolated 36 bovine VNTRs, characterized by a mean
heterozygosity of 59.3% within the American Holstein
breed. Matching probabilities and exclusion powers were
estimated by Monte-Carlo simulation, showing that the
top 5 to 10 probes could be used as a very efficient
DNA-based system for individual identification and
paternity diagnosis. The isolated VNTR systems should
contribute significantly to the establishment of a
bovine primary DNA marker map. Linkage analysis, use of
somatic cell hybrids and in situ hybridization
demonstrate that these bovine VNTRs are organized as
clusters, scattered throughout the bovine genome,
without evidence for proterminal confinement as in the
human (35). Moreover, Southern blot analysis and in
situ hybridization demonstrate conservation of sequence
and map location, respectively, of minisatellites within
Bovidae. A typical VNTR pattern obtained with one of
our probes is shown in Figure 1. A detailed description
of our VNTR systems is reported in "Example 1".

2. Multisite haplotypes

We used 110 random cosmids to probe Southern blots
of 9 unrelated cattle DNAs digested with 12 restriction
enzymes. Although only one third of the expected
fragments could be detected, 85% of the cosmids revealed
at least one polymorphism. The mean heterozygosity of
the generated multisite haplotypes (98) was estimated at
51.9%. A surprisingly high proportion of polymorphisms
(≈25%) was attributed to insertion-deletion events,
compensating for the lower level of nucleotide
diversity, $\pi$, observed in cattle ($\pi \approx 0.0007$) as
compared to the human. The mutation rate at cytosines
in the CpG dinucleotide was estimated to be approximately 10
times higher than at other nucleotides. The
generated markers should cover approximately 40% of the
bovine genome when used in linkage studies. A detailed
description of our multisite haplotypes is reported in
"Example 2".

3. Microsatellites 

Recently, microsatellites were proven to be an
abundant source of highly polymorphic markers in the
human (32-34). As their name implies, microsatellites
are minute VNTR markers (18-20), characterized by tandem
repetitions of very short repeats, one to four base
pairs in length. Microsatellites exhibit levels of
polymorphism comparable to VNTRs, but are much more
abundant and apparently evenly spread throughout the
genome. We have estimated the frequency of
(CA)-dinucleotide repeats in the bovine genome at > 150,000.
Because of their small size, their detection is greatly
facilitated by PCR. Although this imposes the
preliminary determination of flanking DNA sequences to
design the appropriate primers, the subsequent PCR
reaction used for their analysis offers several
advantages over Southern blotting, being fast, requiring
less DNA and being easier to automate.

As part of our effort to build a primary DNA marker
map for cattle, we have isolated more than 250 bovine
microsatellites, amplified most of them in vitro and
shown that the majority of them are indeed polymorphic
in cattle. Several of these have been tentatively
assigned to specific bovine chromosomes using a somatic
cell hybrid panel. Moreover, we have shown that
approximately 50% of the bovine microsatellites can be
successfully used in other Bovidae as well, which will
greatly facilitate the construction of marker maps in
these species.

Magnetic solid phase DNA sequencing procedures
(137) are used for the massive generation of sequence
information, and multiplex approaches are used for genotype
collection, based on the simultaneous detection of
molecules labelled with different fluorescent dyes using
a laser-excited confocal fluorescence gel scanner (139).
A typical microsatellite pattern is shown in Figure
2. A detailed description of our microsatellites is
reported in "Example 3".

The relative location of the markers was determined
by linkage analysis in pedigrees generated by multiple
ovulation and embryo transfer. To assign linkage groups
to specific chromosomes, highly polymorphic "anchor
markers" were mapped using somatic cell hybrids (Jim
Womack, Texas A&M), and by in situ hybridization (Rudy
Fries, ETH - Zurich).

Linkage analysis involving 150 of these markers
generated a primary DNA marker map with 24 linkage
groups counting two or more markers (15 assigned to
specific chromosomes or synteny groups), and 68
singleton markers. A detailed description of our primary
bovine DNA marker map is reported in "Example 4".






II. MICROSATELLITE MAPPING OF A MAJOR GENE FOR MILK
PRODUCTION, LINKED TO BOVINE PROGRESSIVE DEGENERATIVE 
MYELOENCEPHALOPATHY OR WEAVER. 

Identifying polygenes requires the analysis of
pedigrees of considerable size, despite the development
of procedures such as interval mapping, simultaneous
search, selective genotyping, etc. In this work we have
explored an alternative approach to map a polygene,
exploiting the association observed in cattle between
the single gene disorder "Weaver" and increased milk
production. Weaver, or bovine progressive degenerative
myeloencephalopathy, is a recessive disorder
characterized by the appearance between 5 and 8 months
of age of bilateral hind leg weakness and ataxia with
deficient proprioceptive reflexes, without skeletal or
muscular defects. Estimates of gene frequency in the
American Brown Swiss breed point towards the maintenance
of the Weaver gene at relatively high frequency (>5%),
despite the implementation of programs for detection and
elimination of carrier bulls. Moreover, Hoeschele and
Meinert (140) showed that Weaver carrier animals have an
advantage of 690.8 kg milk (> 0.25 phenotypic $\sigma$) above
the mean. Both observations could be accounted for by
the presence of a gene with major effect on milk yield
in linkage disequilibrium with the "Weaver" gene.

Brown Swiss animals showing symptoms of Weaver were
identified with the help of the American Brown Swiss
Association. Blood samples were collected from the
affected animals, their parents, and full-siblings when
available. Diagnosis of Weaver was confirmed in most
cases by anatomopathological examination of spinal cord
and cerebellum at the Department of Pathology of the
College of Veterinary Medicine, Kansas State University.
Shrunken Purkinje cells in the cerebellum, and spheroids and
degenerated myelin sheaths in the spinal cord, were
considered pathognomonic. Altogether, 78 animals were
identified, generating a single, large pedigree. All
animals were genotyped for more than 70 genetic markers:
40 Variable Number of Tandem Repeat markers and more
than 30 Microsatellites. Linkage analysis was performed
using the "LINKAGE" programs (60). The microsatellite
marker TGLA116 gave a highly significant lodscore
of 6.5 for a recombination rate of 7.5%. Although the a
priori probability for pair-wise linkage is unknown in
cattle, a lodscore of 3 is generally considered to be
the threshold for statistical significance, as in the
human. This value (5.8) was obtained assuming complete
penetrance. Actual penetrance for the Weaver condition
is unknown. However, because our pedigree was
constructed by sampling clearly affected animals, the
assumption of complete penetrance is very reasonable in
this situation.

The marker TGLA116 is characterized by three
alleles segregating in our Weaver pedigree. 72% of the
affected individuals were of the 3/3 genotype, 16% of
the 2/3 genotype, and 12% of the 1/3 genotype. Hence,
at least in our family material, the "Weaver" allele
was clearly associated with allele 3 at the marker
locus. Whether similar disequilibrium will be observed
at the population level remains to be determined. The
reported lodscore values were obtained using allelic
frequencies estimated on a sample of 135 sires from the
American Brown Swiss breed.

Because of the biased sampling procedure used to
generate the pedigree, markers showing distorted segregation
could generate erroneous evidence for linkage with
the disease. A "control" pedigree, consisting of more
than 100 Weaver-free Holstein individuals, was therefore
typed for TGLA116 as well. The microsatellite marker
was characterized in this pedigree by the same three
alleles, with respective frequencies of 18%, 57% and 25%
for alleles 1, 2 and 3, showing a perfect Mendelian
segregation. Therefore, it is concluded that marker
TGLA116 is genetically linked to Weaver. From the
generated lodscore curves, the genetic distance between
the two loci is estimated at 3 ± 10 centiMorgan. The
limits of this 95% confidence interval correspond to
recombination rates with lodscores one unit below the
obtained maximum lodscore. Because of the tight linkage
between TGLA116 and Weaver, this marker should be linked
to the associated QTL as well. The distance between
TGLA116/Weaver and the QTL is, however, unknown at this
point. The effect observed using Weaver as marker, however,
was of such magnitude that the genetic distance separating
these loci is unlikely to be great. We are in the
process of determining the relative location of these
three loci.
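
The notions of a lodscore curve and of an interval bounded by lodscores one unit below the maximum can be illustrated with a deliberately simplified sketch. It assumes fully informative, phase-known meioses, which is not how the "LINKAGE" programs treat a general pedigree, and the counts used (3 recombinants in 40 meioses) are hypothetical and are not the data of this study.

```python
import math


def lod_score(theta, n_meioses, n_recombinants):
    """Two-point lodscore log10[L(theta) / L(0.5)] for phase-known meioses."""
    if not 0.0 <= theta <= 0.5:
        raise ValueError("theta must lie in [0, 0.5]")
    k, n = n_recombinants, n_meioses
    if theta == 0.0:
        # Any observed recombinant rules out theta = 0.
        return -math.inf if k > 0 else n * math.log10(2.0)
    return (k * math.log10(theta)
            + (n - k) * math.log10(1.0 - theta)
            - n * math.log10(0.5))


def one_lod_support_interval(n_meioses, n_recombinants, steps=500):
    """Grid search for the maximum lodscore and the range of theta whose
    lodscore lies within one unit of that maximum."""
    grid = [0.5 * i / steps for i in range(steps + 1)]
    scores = [lod_score(t, n_meioses, n_recombinants) for t in grid]
    z_max = max(scores)
    theta_hat = grid[scores.index(z_max)]
    inside = [t for t, z in zip(grid, scores) if z >= z_max - 1.0]
    return theta_hat, z_max, min(inside), max(inside)


# Hypothetical counts, for illustration only.
theta_hat, z_max, lower, upper = one_lod_support_interval(40, 3)
print(f"theta = {theta_hat:.3f}, lodscore = {z_max:.2f}, "
      f"support interval = [{lower:.3f}, {upper:.3f}]")
```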

In consequence, the TGLA116 marker will allow us to
perform marker assisted selection against the Weaver
condition. Indeed, it is now possible, for offspring
from individuals heterozygous for both the Weaver
condition and TGLA116, to estimate the genotypic
likelihoods at the Weaver locus based on their TGLA116
genotype and that of their parents.

In addition, we are now in a position to test the effect
of the corresponding chromosomal segment on milk
production.

III. VELOGENETICS

Few question the fundamental interest of resolving
quantitative traits into their Mendelian components by
mapping the underlying QTL. The implementation of Marker
Assisted Selection into breeding schemes, however, has
not always been received with a lot of enthusiasm. Part
of this skepticism reflects the disbelief that DNA
Marker Maps will become available for our domestic
species within a reasonable time-span, or that QTL can
be identified by linkage strategies. In our view, these
arguments only reflect the lack of information of their
proponents. On the other hand, part of the skepticism
expresses the doubt that the genetic gains obtainable by
Marker Assisted Selection will justify expensive and
tedious large scale genotyping. Although the costs of
genotyping will drop substantially in the near future,
due to the amazing pace at which automation and robotics
are applied to DNA technology, this objection remains
quite pertinent.

Another major limitation of Marker Assisted Selection
in its present form is that it is restricted to the
exploitation of genetic variation preexisting within the
commercial breed of interest, and only if present in a
"high merit" genetic background. Favorable mutations
appearing within a mediocre background, or present in
"exotic" germplasm, would be difficult to exploit, even
with markers.

We propose a scheme, "Velogenetics", combining Marker Assisted
Introgression and germ-line manipulations to reduce the
generation interval, which might drastically increase
the power of Marker Assisted Selection.

A. Marker Assisted Introgression

The basic principles underlying Marker Assisted
Introgression are well-known. A gene responsible for a
favorable attribute can be introgressed from a "donor"
strain into a "recipient" strain by repeated backcrossing.
During the introgression process, the retention of
the favorable gene is monitored in the backcross products
with linked, flanking DNA markers. This latter
aspect is particularly important for traits involving
multiple genes and/or characterized by sex- or
age-limited expression. Classical genetic theory tells us
that, with the exception of the "marked" segment whose
retention is desired, the genomic contribution of the
donor line is diluted by half after each backcross.
Hence, after four backcrosses, the recipient genome
is reconstituted to ± 90% of the original. At the
marked locus, however, the backcross retains one copy of
the desired "donor" variant. If required, one intercross
will then generate 25% of offspring homozygous for the
favorable donor variant. The net result is a "graft" of
an advantageous gene within a recipient background. The
procedure entirely respects the organization and chromosomal
localization of the grafted gene, avoiding the
aberrant expression patterns which too often
characterize transgenes.

Notably, the gene to be transferred does not need to be cloned
per se. Only its genetic map location is required, as defined
by the availability of linked markers, ideally flanking the
gene of interest on each side. Hence, this procedure is
perfectly applicable to the introgression of QTL identified
through the previously described mapping strategies.

Marker Assisted Introgression can easily be applied to several
genes simultaneously. This feature will be of particular
interest for complex traits involving several genes.
Introgressing more than one gene from a donor to a recipient
line, however, increases the selection intensity at each
backcross: with 1 marker, 1/2 of the offspring have the
favorable genotype; with 2, 1/4; and with n markers, (1/2)^n.
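
The (1/2)^n rule can be illustrated with a short calculation. The
following is a minimal sketch, assuming that the marked loci are
unlinked and that the backcross parent carries exactly one donor
copy at each of them; the function names are hypothetical and are
introduced here for illustration only.

```python
def favorable_fraction(n_markers: int) -> float:
    """Expected fraction of backcross offspring that inherit the donor
    allele at every one of n unlinked marked loci (assumption: the
    backcross parent is heterozygous donor/recipient at each locus)."""
    return 0.5 ** n_markers


def expected_offspring_to_screen(n_markers: int, n_wanted: int = 1) -> float:
    """Expected number of offspring to genotype in order to recover
    n_wanted animals carrying all donor alleles."""
    return n_wanted / favorable_fraction(n_markers)


if __name__ == "__main__":
    for n in range(1, 5):
        print(n, favorable_fraction(n), expected_offspring_to_screen(n))
```

The number of offspring to be genotyped per retained animal thus
doubles with each additional gene that is introgressed.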

Selecting for the retention of defined "donor" genes will
hamper the recovery of the recipient background genotype in
adjacent chromosomal regions. This can be compensated for by
increasing the number of backcrosses, or better by monitoring
the fate of additional adjacent markers to identify the
backcross products resulting from recombinations as close to
the "grafted" gene as possible.

B. Shortening the Generation Interval
of Domestic Species by "Velogenesis"

Introgression by repeated backcrossing, assisted or not by
genetic markers, is common practice in a variety of organisms,
but is essentially unfeasible in domestic animals such as
cattle because of their prohibitively long generation time.
The generation interval of such species could, however, be
reduced by the in vitro maturation and fertilization of foetal
oocytes, hereinafter referred to as "Velogenesis".

An overview of female gametogenesis (100, 101) indicates that
the feasibility of such a scheme may not be that far-fetched.
Briefly, oogenesis begins with the formation of primordial germ
cells in the region of the allantois. These precursor cells
migrate to the developing gonads where, after a period of
mitotic proliferation, they enter meiosis. Meiosis is arrested
at the diplotene stage of prophase I by the poorly understood
"meiotic division I arrest system", after which the primary
oocyte enters a resting phase. During the lifetime of the
animal, small numbers of resting primary oocytes are
successively recruited into a pool of growing oocytes, within
the environment of a gonadotropin-dependent developing
follicle. These activated oocytes grow in size, acquire the
competence to resume meiosis if appropriately stimulated, and
accumulate the material required to sustain the early stages of
subsequent embryonic development. Resumption of meiosis and
oocyte maturation are triggered by a hypothetical
maturation-inducing signal produced by granulosa cells in
response to gonadotropins. At least in rodents, oocyte
maturation seems to be mediated by a drop in cyclic AMP in the
oocyte and subsequent inactivation of a type A protein kinase.
Evidence for the role of this pathway in oocyte maturation is,
however, much more controversial in ruminants. Note that in the
granulosa cells, gonadotropins act, among other pathways,
through the activation of adenylate cyclase with a subsequent
increase in cAMP concentrations (102). In the oocyte, a cascade
of still to be determined events then probably leads to the
phosphorylation and activation of a phosphatase, probably
homologous to the product of the S. pombe cdc25 gene (103),
which will itself dephosphorylate and activate the M-phase
promoting factor (MPF), now known to be a complex of a p34cdc2
protein kinase subunit with a B-type cyclin (see 104 for a
review). The maturing oocyte completes the first meiotic
division and enters the second (becoming a secondary oocyte),
which is in turn arrested at metaphase II until fertilization.
This "meiotic division II arrest system" is thought to reflect
the stabilization of MPF mediated by the kinase activity of
pp39mos on either a cyclin protease or on cyclin itself.
Fertilization relieves this block by increasing the
intracellular Ca2+ concentration, triggering calcium-dependent
protease activity (reviewed in 104).

In cattle, primordial germ cells reach the genital ridge at
about 40 days of gestation. After a period of mitotic
proliferation, they differentiate into oogonia starting around
60 days of gestation. Mitotic proliferation of the germ line
ceases around day 170 of gestation, fixing the maximum number
of oocytes the female will ever have. Meiosis starts at about
80 days, and the first primordial follicles are discernible at
90 days of gestation. Remarkably, activation of resting
primordial follicles starts already in utero, around day 140,
and secondary and tertiary follicles can be seen at 210 and 230
days, respectively. It is estimated that 2 to 4 resting
primordial follicles are recruited daily into the pool of
activated, developing follicles. These activated foetal
oocytes, however, are irrevocably committed to follicular
atresia. Indeed, spontaneous oocyte maturation and ovulation do
not begin until puberty. Submitted to an appropriate hormonal
stimulus, however, prepubertal oocytes can resume meiosis, can
be fertilized and can produce viable offspring. Indeed,
offspring have been obtained from gonadotropin-stimulated calf
oocytes transferred to postpubertal recipient animals (reviewed
in 105). The purpose of velogenesis would be to attempt to
obtain similar results with foetal oocytes at the earliest
stage possible, as early as 90 to 180 days of gestation.
Very encouraging is the development in mice of culture systems
supporting the growth of primary follicles, yielding mature
oocytes capable of fertilization in vitro and development to
term (106, 107). It is reasonable to anticipate that similar
conditions, supporting development of bovine oocytes, will
become available in a species where primary oocytes from
relatively small antral follicles can already be successfully
matured and fertilized in vitro.

One way to achieve velogenesis would be to attempt to rescue
oocyte nuclei from primordial follicles by their transfer into
enucleated, maturable oocytes.

So far we have only discussed velogenesis through the reduction
of the female generation interval. "Male" velogenesis could
similarly be accomplished by the early stimulation of
spermatogenesis.

The impact on breeding programs of "velogenesis", or the
reduction of generation time by in vitro maturation and
fertilization of fetal oocytes, has been discussed by
Betteridge et al. (101). In dairy breeding, for instance,
annual responses in milk yield could be doubled compared to
conventional progeny testing. With the added power of Marker
Assisted Introgression, the approach becomes much more
powerful. "Velogenetics", or the synergistic use of Marker
Assisted Introgression and "velogenesis", can be viewed as a
procedure for the rapid and efficient intraspecies transfer of
desirable genes between genetic backgrounds. By analogy with
the term "transgene", the manipulated genes are referred to as
"velogenes".

In particular, desirable traits identified outside commercial
breeding stock could be quickly introgressed into high merit
genetic backgrounds. Examples would include disease resistance,
genes affecting milk and meat composition, Polled, coat color
genes, etc. Moreover, the possibility to exploit "exotic"
genetic variation identified outside the breed of interest is
particularly attractive because it greatly facilitates the
mapping of the genes of interest.

A schematic representation of "Velogenetics" is 
shown in Figure 3. 



The present invention is further detailed in the following
Examples, which are offered by way of illustration and are not
intended to limit the invention in any manner. Standard
techniques well known in the art or the techniques specifically
described below were utilized.

EXAMPLE 1

CHARACTERIZATION OF
A SET OF VARIABLE NUMBER OF TANDEM REPEAT MARKERS
CONSERVED IN BOVIDAE.



INTRODUCTION

Human minisatellite sequences, exhibiting very high levels of
genetic polymorphism due to variation in the number of tandem
repetitions, have proven an invaluable source of genetic
markers commonly termed "VNTRs" (18-20). VNTRs have been
instrumental in the genetic mapping of several disease-causing
genes, as tools for individual identification and paternity
diagnosis, and to address a variety of biological issues,
including imprinting, loss of heterozygosity in malignancies,
etc.

In animal genetics, highly polymorphic markers such as VNTRs
could similarly be used for individual identification and
paternity diagnosis, which rely today on less informative
biochemical polymorphisms and blood group systems, and for the
mapping of so-called economic trait loci (ETL), or genes
involved in the determination of production traits.
Classically, artificial selection has relied on the
biometrical evaluation of breeding values from individual
performance records and from performances of relatives (136).
One of the powers of the biometrical approach is that it
obviates the need for any detailed molecular knowledge of the
underlying genes or ETL. However, it is believed that the
genetic mapping of ETL could be used to increase genetic
response by affecting accuracy and speed of selection, through
a procedure called marker assisted selection (MAS) (91, 96).
Moreover, defined alleles could be moved efficiently between
genetic backgrounds by velogenetics, or the combined use of
marker assisted introgression and germline manipulations aimed
at reducing the generation interval (141, 142).

We report the cloning and characterization of 36 bovine
variable number of tandem repeat (VNTR) markers, characterized
by a high degree of polymorphism within commercial herds and
shown to be conserved within Bovidae.

MATERIALS AND METHODS

1. Cloning of bovine VNTRs and detection of
polymorphism:

500 µg genomic DNA from 20 unrelated cows was digested to
completion with MboI or HaeIII. After two rounds of size
fractionation by agarose gel electrophoresis, electroelution
and addition of EcoRI linkers (only for HaeIII-restricted DNA),
fractions from 3 to 4 Kb (kilobases), from 4 to 6 Kb and above
6 Kb were ligated into the BAP-dephosphorylated BamHI and EcoRI
sites, respectively, of pUC13. Approximately 80,000 independent
clones were obtained by transformation of DH5α cells, and were
screened successively with the following minisatellite
sequences: the 282 base pair HaeIII-ClaI fragment containing
the minisatellite in the protein III gene of wild-type M13,
pUCJ, pSP64.2.5EI ('Per'), pα3'HVR64, pINS310, EFD134.7 and pS3
(20, 21, 110, 123, 143). Hybridization and washings were done
in the conditions used to generate multilocus DNA fingerprints
with the respective probes (110, 143). To check for
polymorphism, plasmid DNA isolated from positive colonies was
used to probe MboI, HaeIII and TaqI Southern blots of 18
randomly selected American Holsteins. Hybridizations were done
at 65°C in 7% SDS, 10% PEG, 50 mM NaHPO4 with addition of
50 µg/ml bovine genomic DNA. Final washes were at 65°C in
0.1x SSC, 0.1% SDS. When using bovine probes on ovine Southern
blots, hybridization and washing temperatures were reduced by
10°C.

2. Estimation of Matching Probabilities and
Exclusion Powers:

Allelic frequencies were estimated from the sample of 18
randomly selected American Holsteins. Matching probabilities
and exclusion powers (113) were then estimated by Monte-Carlo
simulation (10,000 simulations in each case), assuming
Hardy-Weinberg equilibrium and using "Patpower", a program
designed by one of us. The following parameters were estimated:
MPR: matching probability for two randomly selected
individuals; MPS: matching probability for full-sibs; EPR:
probability to exclude an alleged father unrelated to the real
one (mother's phenotype known); EPS: probability to exclude an
alleged father full-sib to the real one (mother's phenotype
known); EP1: probability to exclude a wrongly assigned parent
without phenotypic information available from the other one.
Patpower calculates Matching Probabilities and Exclusion Powers
characterizing given autosomal polymorphic systems, by
Monte-Carlo simulation.

Matching Probabilities relate to individual identification and
express the likelihood that two individuals would have the same
pattern with a given probe. Patpower calculates two types of
Matching Probabilities: MPR, the Matching Probability for two
unrelated individuals, and MPS, the Matching Probability for
two full-sibs.

Exclusion Powers relate to paternity diagnosis and express the
likelihood that a wrongly assigned paternity or maternity will
be detected with a given probe. Patpower determines three types
of Exclusion Power: EPR, where one parent is known with
certainty and the proband is unrelated to the other real
parent; EPS, where one parent is known with certainty and the
proband is a full-sib of the other real parent; and EP1, where
only the proband is available.

The user needs to input the number of alleles characterizing
the polymorphic system in the population of interest, their
respective frequencies, and their dominance-recessivity
relationships. For the ABO blood group system in humans, for
instance, A and B are codominant and both dominate O. Each
allele is given a binary code following the rules of the
"LINKAGE" program (60).
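
The binary coding of alleles can be illustrated on the ABO example.
The following is a minimal sketch, not the Patpower source: the
bit layout (one bit per detectable factor, no bit for the recessive
O allele) and all names are assumptions made here for illustration.
The phenotype is obtained by combining the two allele codes of a
genotype with the boolean "or" operator, which is the operator
named in the description of the simulation.

```python
# Hypothetical binary codes: bit 0 = factor A detected, bit 1 = factor B
# detected. The recessive O allele contributes no detectable factor.
ALLELE_CODE = {"A": 0b01, "B": 0b10, "O": 0b00}

# Readable names for the four possible phenotype codes.
PHENOTYPE_NAME = {0b00: "O", 0b01: "A", 0b10: "B", 0b11: "AB"}


def phenotype(allele1: str, allele2: str) -> int:
    """Phenotype code of a genotype: bitwise OR of the two allele codes."""
    return ALLELE_CODE[allele1] | ALLELE_CODE[allele2]


if __name__ == "__main__":
    for genotype in [("A", "A"), ("A", "O"), ("A", "B"), ("B", "O"), ("O", "O")]:
        print(genotype, PHENOTYPE_NAME[phenotype(*genotype)])
```

With this coding, genotypes AA and AO yield the same phenotype code,
which is how dominance of A over O is expressed.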

Patpower then stochastically generates a pair of parents with
an offspring, a full-sib of the real father and an unrelated
individual. "Phenotypes" are obtained from the genotype using
the boolean "or" operator and are used to determine matching
between unrelated individuals (MPR) and between full-sibs
(MPS), as well as the exclusion of the unrelated individual
considered as a proband, with (EPR) and without (EP1)
information from one of the real parents, and exclusion of the
uncle considering information from the real mother (EPS). This
simulation is repeated as many times as determined by the user,
allowing for the estimation of the respective likelihoods.
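
The simulation described above can be sketched as follows. This is
an illustrative reimplementation under simplifying assumptions, not
the Patpower program itself: all alleles are taken to be codominant
and detectable (so the phenotype is simply the unordered pair of
alleles), genotypes are drawn under Hardy-Weinberg equilibrium, and
the function and variable names are hypothetical.

```python
import random


def draw_genotype(alleles, freqs, rng):
    """Random genotype under Hardy-Weinberg equilibrium."""
    return tuple(rng.choices(alleles, weights=freqs, k=2))


def child_of(parent1, parent2, rng):
    """Offspring genotype: one random allele from each parent."""
    return (rng.choice(parent1), rng.choice(parent2))


def same_pattern(g1, g2):
    """Two codominant genotypes give the same banding pattern."""
    return sorted(g1) == sorted(g2)


def paternal_candidates(child, mother):
    """Alleles of the child that may be paternal, given the mother."""
    c1, c2 = child
    candidates = set()
    if c2 in mother:
        candidates.add(c1)
    if c1 in mother:
        candidates.add(c2)
    return candidates


def simulate(alleles, freqs, n_sims=10_000, seed=0):
    """Monte-Carlo estimates of MPR, MPS, EPR, EPS and EP1."""
    rng = random.Random(seed)
    counts = dict.fromkeys(("MPR", "MPS", "EPR", "EPS", "EP1"), 0)
    for _ in range(n_sims):
        # Paternal grandparents, so that father and uncle are full-sibs.
        grandsire = draw_genotype(alleles, freqs, rng)
        granddam = draw_genotype(alleles, freqs, rng)
        father = child_of(grandsire, granddam, rng)
        uncle = child_of(grandsire, granddam, rng)
        mother = draw_genotype(alleles, freqs, rng)
        unrelated = draw_genotype(alleles, freqs, rng)
        child = child_of(father, mother, rng)

        counts["MPR"] += same_pattern(father, unrelated)
        counts["MPS"] += same_pattern(father, uncle)
        candidates = paternal_candidates(child, mother)
        # Excluded when the proband carries no possible paternal allele.
        counts["EPR"] += candidates.isdisjoint(unrelated)
        counts["EPS"] += candidates.isdisjoint(uncle)
        # Without the mother: excluded when no allele is shared at all.
        counts["EP1"] += set(child).isdisjoint(unrelated)
    return {name: count / n_sims for name, count in counts.items()}


if __name__ == "__main__":
    print(simulate(["a", "b", "c", "d"], [0.4, 0.3, 0.2, 0.1]))
```

Handling dominance-recessivity relationships or undetectable
alleles would require comparing phenotypes rather than genotypes
and testing every genotype compatible with each phenotype; this is
deliberately left out of the sketch.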

3. Segregation and linkage analysis:

All members of an American Holstein pedigree with 91 offspring
obtained by multiple ovulation and embryo transfer (MOET) from
20 parents were genotyped for all identified markers. Each
parent has a mean of 9.1 offspring with a mean of 1.9 partners.
Segregation and linkage analysis were done with slightly
modified versions of the "LINKAGE" programs as previously
described (31).

4. Synteny mapping:

The hybrid somatic cells were prepared by fusion as previously
described (97). Southern blot hybridization and concordancy
analysis were done according to Threadgill et al. (114).

5. In situ hybridization:

Chromosomes were prepared as described by Fries et al. (115)
and chromosome identification was based on QFQ-banding and
according to the international standard (116). Probe
preparation and in situ hybridization were as previously
described (144).

RESULTS

1. A set of bovine VNTR markers:

Using the strategy described above, we have isolated a total of
36 bovine VNTRs, listed in Table 1. Polymorphic patterns were
attributed to minisatellite sequences when characterized by
more than two alleles distinguishable with more than one
restriction enzyme. Seven additional, non VNTR-type
polymorphisms were detected during this experiment and are
reported as well.
TABLE 1
VNTR Clones

Name       Locus (1)  Polymorph. (2)  Enz. (3)      Het. (4)  Ovin (5)
GMBT-002   DU16S1b    VNTR            HaeIII        52        P
GMBT-003   -          VNTR            HaeIII        56        -
GMBT-005   D24S1b     VNTR            HaeIII        85        N
GMBT-006   D14S1b     VNTR            HaeIII        73        N
GMBT-007   DU10S1b    VNTR            HaeIII        96        P
GMBT-008   -          VNTR            TaqI          -         P
GMBT-009   DU22S1b    VNTR            HaeIII        58        -
GMBT-011   D26S1b     VNTR            HaeIII        85        P
GMBT-012   -          VNTR            MboI          22        -
GMBT-013   -          VNTR            HaeIII        4         -
GMBT-015   D21S3b     VNTR            HaeIII        61        P
GMBT-016   D21S12b    VNTR            HaeIII        78        P
GMBT-017   D8S1b      VNTR            HaeIII        15        M
GMBT-019   D10S1b     VNTR            MboI          7         -
GMBT-020   -          VNTR            MboI          65        -
GMBT-021   D21S2b     VNTR            HaeIII        65        -
GMBT-022   D18S1b     VNTR            MboI          40        -
GMBT-025   -          VNTR            HaeIII        25        -
GMBT-026   -          VNTR            HaeIII        26        -
GMBT-027   -          VNTR            MboI          40        -
GMBT-028   -          VNTR            HaeIII        81        -
GMBT-031   -          VNTR            HaeIII        58        -
GMBT-033   -          VNTR            HaeIII        70        -
GMBT-034   -          VNTR            HaeIII        20        -
GMBT-035   -          VNTR            HaeIII        59        -
GMBT-036   DU27S1b    VNTR            HaeIII        89        -
GMBT-039   -          VNTR            HaeIII        33        -
GMBT-041   D23S1b     VNTR            HaeIII        81        -
GMBT-042   -          VNTR            HaeIII        78        -
GMBT-047   D2S2b      VNTR            HaeIII        65        -
GMBT-049   -          VNTR+PM         HaeIII, MboI  -         -
GMBT-051   -          VNTR            HaeIII        94        -
GMBT-053   -          VNTR            HaeIII        59        -
GMBT-058   -          VNTR            TaqI          89        -
GMBT-059   DU10S2b    VNTR            BamHI         67        -
GMBT-060   -          VNTR+PM         MspI          87        -
GMBT-002   -          PM              TaqI          37        -
GMBT-024   -          PM              TaqI          62        -
GMBT-029   -          PM              MboI          28        -
GMBT-014   DU22S2b    (?)             MboI, TaqI    68        -
GMBT-018   -          (?)             TaqI          17        -

1 LOCUS: locus name following HGM nomenclature rules
whenever available from mapping studies.

2 POLYMORPH: type of polymorphism (VNTR: Variable Number of
Tandem Repeats; PM: Point Mutation; (?): unexplained).

3 ENZ: preferred restriction enzyme for its detection.

4 HET: heterozygosity (in %) within Holsteins, estimated from a
sample of 27 presumably unrelated Holstein animals.

5 OVIN: cross-reaction in sheep; N, negative; M, monomorphic;
P, polymorphic; -, not tested.



WO 92/13102 



PCT/US92/00340 




Within the American Holstein breed, the mean heterozygosity
over all VNTR systems was 59.3%. When using probe GMBT-016 with
MboI instead of HaeIII, and supposedly because of the presence
of minisatellite variant repeats (MVR) (39) harboring MboI
sites, an extremely variable, locus-specific "midisatellite"
pattern (36) is generated (data not shown). Used with MboI,
this probe is particularly powerful for individual
identification.

We found one clear instance of maternal neomutation with probe
GMBT-022. Besides this, all probes showed proper Mendelian
segregation.

Table 2 reports estimated matching probabilities and exclusion
powers as well. Systems GMBT-009, GMBT-011 and GMBT-022 were
treated as "open" systems, meaning that, because of their small
size, some alleles were not detectable in our conditions. To
avoid ambiguities in identification and paternity diagnosis,
these unidentified alleles were pooled in a single "recessive"
class. For individuals showing a single band, no distinction
was made between homozygosity and heterozygosity based on band
intensity.

Discrepancies between probe ranking according to heterozygosity
and ranking according to matching probabilities and exclusion
powers most probably result from the small sample size used to
estimate both types of parameters. However, heterozygote
advantage at some loci could be an alternative, although
unlikely, explanation in view of the apparent neutral behaviour
of human minisatellite sequences (124).
TABLE 2
Matching Probabilities for VNTR Clones
(values in %)

Name       MPR (1)  MPS (2)  EPR (3)  EPS (4)  EP1 (5)
GMBT-002   21.24    52.05    36.27    17.31    19.66
GMBT-003   24.25    53.53    30.90    15.37    16.56
GMBT-005   12.02    43.24    49.53    24.18    31.99
GMBT-006   19.38    49.49    36.93    18.28    20.56
GMBT-007   03.25    33.38    72.45    34.97    56.19
GMBT-008   -        -        -        -        -
GMBT-009   12.13    43.73    45.86    21.84    30.42
GMBT-011   10.88    42.14    49.34    24.34    33.24
GMBT-012   44.09    68.44    17.95    08.87    06.45
GMBT-013   92.28    96.28    01.96    00.84    00.16
GMBT-015   12.99    45.00    47.48    22.82    29.82
GMBT-016   12.59    44.10    48.87    24.15    30.17
GMBT-017   64.57    81.50    09.92    04.84    22.60
GMBT-019   42.29    96.26    01.91    00.89    00.07
GMBT-020   48.29    70.38    13.83    07.61    05.78
GMBT-021   21.22    50.93    35.26    17.79    19.21
GMBT-022   02.39    32.78    75.43    37.23    61.60
GMBT-025   50.37    72.51    14.04    07.47    05.04
GMBT-026   58.10    77.26    13.16    06.39    03.11
GMBT-027   -        -        -        -        -
GMBT-028   03.19    32.60    74.11    36.45    58.33
GMBT-031   20.82    50.89    36.05    17.81    20.19
GMBT-033   -        -        -        -        -
GMBT-034   71.41    84.57    07.72    03.74    01.44
GMBT-035   24.24    53.17    32.31    15.86    16.80
GMBT-036   04.19    35.42    69.74    33.07    53.05
GMBT-039   39.79    63.90    17.83    08.65    10.16
GMBT-041   12.18    43.84    48.09    22.72    30.41
GMBT-042   07.38    38.85    57.61    27.43    40.02
GMBT-047   30.09    55.38    25.43    12.56    14.79
GMBT-049   -        -        -        -        -
GMBT-051   02.40    32.23    76.37    38.01    61.37
GMBT-053   15.16    44.76    43.80    21.04    27.63
GMBT-058   08.29    40.06    55.59    16.56    38.03
GMBT-059   16.87    45.92    40.31    19.40    24.04
GMBT-060   14.28    44.92    44.23    21.78    26.89
GMBT-002   49.60    71.08    13.71    06.71    05.38
GMBT-024   24.99    53.36    30.48    14.96    16.51
GMBT-029   40.24    61.84    18.34    08.75    10.99
GMBT-014   50.36    72.43    06.32    03.43    00.10
GMBT-018   85.79    92.78    03.59    01.86    00.27

1 MPR is Matching Probability for two randomly selected
individuals.

2 MPS is Matching Probability for two full-sibs.

3 EPR is Exclusion Power when putative father is unrelated to
real father.

4 EPS is Exclusion Power when putative father and real father
are full-sibs.

5 EP1 (or EPSP) is Exclusion Power when only one parent is
available.
2. Genomic distribution:

We performed pair-wise linkage analysis between all markers. As
it is known that at least some of these sequences are organized
as minisatellite clusters (31, 35, 145), we were expecting to
find tightly linked systems. We found evidence for five pairs
of linked markers, of which four were characterized by a
recombination rate below 5% (Table 3). However, two of the
systems involved in a tight linkage detect non VNTR-type
polymorphisms (GMBT-014, GMBT-022). The corresponding probes
were probably isolated because they contain a genuine although
non-polymorphic minisatellite, and were fortuitously detecting
other types of polymorphism. Despite these five linked pairs,
results of the linkage analysis pointed towards a scattering of
these markers throughout the bovine genome.



TABLE 3

Linked Systems            θ (1)    lodscore (2)
GMBT-003 and GMBT-029     0.0%     5.00
GMBT-007 and GMBT-059     11.3%    9.11
GMBT-009 and GMBT-014     4.8%     3.74
GMBT-015 and GMBT-016     3.7%     27.00
GMBT-028 and GMBT-047     2.5%     9.40

1 θ = recombination rate.

2 pair-wise lodscores were calculated
with the "LINKAGE" programs.
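
For readers unfamiliar with lodscores, the quantity reported in
Table 3 can be illustrated in its simplest form. The following is a
minimal sketch, assuming phase-known, fully informative meioses in
which recombinants can be counted directly; the "LINKAGE" programs
compute likelihoods on general pedigrees and are not reproduced
here. Function names and the example counts are hypothetical.

```python
import math


def lod_score(recombinants: int, meioses: int, theta: float) -> float:
    """Two-point lodscore: log10 of the likelihood at recombination rate
    theta relative to the likelihood under free recombination (0.5)."""
    if not 0.0 <= theta <= 0.5:
        raise ValueError("theta must lie between 0 and 0.5")
    r, n = recombinants, meioses
    if theta == 0.0:
        # Complete linkage is only compatible with zero recombinants.
        return n * math.log10(2) if r == 0 else float("-inf")
    return (r * math.log10(theta)
            + (n - r) * math.log10(1.0 - theta)
            + n * math.log10(2))


def max_lod(recombinants: int, meioses: int):
    """Maximum-likelihood recombination rate and the lodscore at that rate."""
    theta_hat = min(recombinants / meioses, 0.5)
    return theta_hat, lod_score(recombinants, meioses, theta_hat)


if __name__ == "__main__":
    # Hypothetical counts: 2 recombinants among 40 informative meioses.
    print(max_lod(2, 40))
```

A lodscore of 3 or more, corresponding to odds of at least 1000:1
in favor of linkage, is the conventional threshold for declaring
two loci linked.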



Reference markers for the respective synteny groups were
U1:GNB1, U2:ME1, U3:NKNB, U4:MPI, U5:FOS, U6:AMY1, U7:LDHA,
U8:GNB2, U9:GPI, U10:SOD1, U11:VIM, U12:GPX1, U13:MET, U14:GSR,
U15:CASK, U16:ABL, U17:CRYG, U18:GGTB2, U19:CAT, U20:GLO1,
U21:GH, U22:AMH, U23:ALDH2, U24:TG, U25:CLTIA, U26:OAT,
U27:DU27S1b, U28:MBP, U29:RBP3 and X:DHD. Synteny groups with
highest concordancy scores, to which corresponding VNTRs were
assigned, are underlined.

Evidence for a broad genomic distribution of our VNTRs was
supported by the tentative assignment of 13 of them to 11
different synteny groups using somatic cell hybrids (Table 4).
GMBT-036 identifies a previously unmarked bovine synteny group.
Probe GMBT-021 was assigned to the same synteny group as probes
GMBT-015 and -016. Although the latter two probes were shown to
be tightly linked, linkage between those probes and the former
one could be excluded for recombination rates < 15%.
Eight VNTRs were also mapped by in situ hybridization: GMBT-006
to 14q11-16, GMBT-005 to 24q13.3-22, GMBT-011 to 26q11-21,
GMBT-015 and GMBT-016 to 21q22-24, GMBT-019 to 10q14-23,
GMBT-022 to 19q21-23, GMBT-028 to 2q13-21. Again good genomic
coverage was evident, since six probes mapped to five different
chromosomes. Probes GMBT-015 and -016 both mapped to 21q23-24
as expected from the linkage study and the assignment on the
hybrid panel. Surprisingly, five out of the eight VNTRs clearly
showed an interstitial map location. Only probes GMBT-015, -016
and -022 were located proterminally, the former two identifying
the same minisatellite cluster. These results seem to contrast
with those of Royle et al. (35), which demonstrated
preferential proterminal mapping of human VNTRs. Probe
GMBT-011, previously located on U26, was mapped to chromosome
26, allowing us to tentatively assign synteny group U26 to
chromosome 26.

3. Conservation of sequence and map location within
Bovidae:

We hybridized ten bovine VNTRs to sheep Southern blots, under
slightly reduced stringency conditions. Seven of them yielded
locus-specific patterns, of which six showed a substantial
degree of polymorphism in a sample of 5 unrelated sheep
(Table 1).

Probes GMBT-016, -019 and -022, mapping in the bovine to
21q23-qter, 10q15-q24 and 19q21-qter respectively, were mapped
by in situ hybridization in sheep as well. The three probes
produced signals on chromosomes 18, 7 and 11, recognized as
evolutionary homologues of bovine chromosomes 21, 10 and 19
(116). Moreover, the signals were found over the exact
positions expected in case of conservation of chromosomal
location in cattle and sheep.



DISCUSSION



To isolate bovine VNTRs, we have used a strategy similar to
Wong et al. (146, 147), based on the screening of size-selected
restriction fragments obtained by complete digestion with the
four-cutters MboI and HaeIII. Advantages of this strategy are:
(1) the complexity of this size-range is substantially reduced;
following Bishop et al. (128) and assuming an exponential
distribution of restriction fragment lengths, the fragments
> 2 Kb represent about 10^-3 of the total number of MboI or
HaeIII fragments, corresponding to approximately 10^4
fragments; this allows us to work readily with plasmid vectors;
(2) the subsequent search for and use of the polymorphism is
performed with the same enzyme used to generate the libraries,
obviating the need to screen several restriction enzymes, hence
reducing costs; (3) relying on frequent four-cutters, the
cloned minisatellites contain very little flanking sequence and
only very few of them carry highly repeated sequences which
would interfere during hybridization; (4) theoretically, the
larger minisatellites targeted by this approach are more likely
to be involved in mutational events and could therefore be the
more polymorphic ones.
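
The order of magnitude quoted under advantage (1) can be checked
with a short calculation. The following is a minimal sketch under
stated assumptions that are not taken from the text: equal
frequencies of the four bases (so that a four-base recognition site
occurs on average every 4^4 = 256 bp), restriction sites scattered
at random, and a genome of about 3 x 10^9 bp. Function names are
hypothetical.

```python
import math


def fraction_longer_than(length_bp: float, site_len: int = 4) -> float:
    """Fraction of restriction fragments longer than length_bp, assuming
    exponentially distributed fragment lengths with mean 4**site_len bp."""
    mean_fragment_bp = 4 ** site_len
    return math.exp(-length_bp / mean_fragment_bp)


def expected_fragments(genome_bp: float, length_bp: float,
                       site_len: int = 4) -> float:
    """Expected number of fragments longer than length_bp in a genome."""
    total_fragments = genome_bp / 4 ** site_len
    return total_fragments * fraction_longer_than(length_bp, site_len)


if __name__ == "__main__":
    genome_bp = 3e9  # assumed genome size, not a value from the text
    print(fraction_longer_than(2000))
    print(expected_fragments(genome_bp, 2000))
```

Under these idealized assumptions the calculation lands within an
order of magnitude of the figures quoted above; real genomes depart
from equal base frequencies, which shifts the mean fragment length
for each enzyme.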

A disadvantage of this approach is the unequal representation
of minisatellite loci in our library. The libraries were
generated with a mixture of DNA from 20 unrelated individuals,
to increase the number of clonable minisatellites. As a
consequence, loci for which most alleles are within the
selected size range will be overrepresented, compared to loci
for which the majority of alleles in the population are below
this range.

This collection of bovine VNTRs could be used for DNA-based
individual identification and paternity diagnosis. Combining
our top 5 probes, matching probabilities and exclusion powers
at least as good as those obtained with classical systems are
obtained: MPR: 8x10^-8, MPS: 4x10^-3, EPR: 0.999, EPS: 0.893,
EP1: 0.987. Adding more probes will of course only increase the
power of the system. As a matter of fact, these probes have
been used efficiently to solve paternity problems beyond the
power of blood group systems. DNA typing is not limited to
blood samples as present systems are, which expands its
spectrum of applications and power. As an example, DNA typing
has been used to deal efficiently with fetal blood cell
chimerism (127), frequently encountered in cattle. Compared
with multilocus DNA fingerprints, locus-specific VNTRs are much
easier to interpret and are more reproducible. Following
properly established standardization procedures, a "common
language" could be established allowing exchange of information
between laboratories. It is noteworthy that heterozygosity and
allelic frequencies for some probes seem to vary substantially
between breeds. As an example, probe GMBT-012 is characterized
by a heterozygosity of 22% in Holsteins, but higher than 50% in
both Herefords and Brown Swiss. Hence, proper use of these
probes may initially require accurate estimation of genetic
variation for different breeds.
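
The way single-probe values combine into a multi-probe figure can
be sketched as follows. This is an illustration under assumptions,
not the authors' calculation: the probes are treated as independent
(unlinked), and the five probes used here are simply those with the
lowest MPR in Table 2, since the text does not name its "top 5".
Matching probabilities multiply across probes, while a false parent
escapes exclusion only if every probe fails to exclude.

```python
import math

# Per-probe values in percent, copied from Table 2:
# (MPR, MPS, EPR, EPS, EP1). The choice of probes is an assumption.
TOP5 = {
    "GMBT-022": (2.39, 32.78, 75.43, 37.23, 61.60),
    "GMBT-051": (2.40, 32.23, 76.37, 38.01, 61.37),
    "GMBT-028": (3.19, 32.60, 74.11, 36.45, 58.33),
    "GMBT-007": (3.25, 33.38, 72.45, 34.97, 56.19),
    "GMBT-036": (4.19, 35.42, 69.74, 33.07, 53.05),
}


def combined_matching(percentages):
    """Probability that two individuals match at every probe."""
    return math.prod(p / 100.0 for p in percentages)


def combined_exclusion(percentages):
    """Probability that at least one probe excludes a false parent."""
    return 1.0 - math.prod(1.0 - p / 100.0 for p in percentages)


def combine(probes):
    """Combined MPR, MPS, EPR, EPS and EP1 over a set of probes."""
    mpr, mps, epr, eps, ep1 = zip(*probes.values())
    return {
        "MPR": combined_matching(mpr),
        "MPS": combined_matching(mps),
        "EPR": combined_exclusion(epr),
        "EPS": combined_exclusion(eps),
        "EP1": combined_exclusion(ep1),
    }


if __name__ == "__main__":
    for name, value in combine(TOP5).items():
        print(f"{name}: {value:.3g}")
```

With this selection of probes, the combined MPS and exclusion powers
come out close to the values quoted above, and the combined MPR is
of the same order of magnitude.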

Assuming a coverage of 20 cM per marker in linkage studies, the
set of markers described in this paper would allow the scanning
of approximately 7 Morgans. Accepting a total map length for
the bovine genome of 25 Morgans (148), this represents close to
33%. We have complemented the set of bovine VNTRs described in
this paper with over 80 multisite haplotypes, generated with
cosmid probes, and more than 100 microsatellite systems (31,
148, 149). Therefore, the majority of the bovine genome is now
amenable to linkage scanning. Since several of these markers
are already "anchored" to specific chromosomes or synteny
groups, a primary bovine DNA marker map should soon be
available. Moreover, the remarkable conservation of mini- and
microsatellites within Bovidae will substantially accelerate
the construction of genetic maps in sheep and goats and offer
the possibility to address interesting evolutionary issues.

EXAMPLE 2 

GENERATION OF BOVINE MULTISITE HAPLOTYPES
USING RANDOM COSMID CLONES.






INTRODUCTION 

The ability to generate nearly unlimited
numbers of genetic markers through the study of DNA
Sequence Polymorphism (DSP) (76) has revolutionized
human genetics: genetic markers have been used to map
genes involved in a variety of human diseases, which has
direct implications for genetic counselling strategies
and is a first step towards their subsequent cloning by
reverse genetics; they are revolutionizing individual
identification and the examination of familial
relationships; and they are invaluable tools in the
study of a wide variety of biological issues. In
particular, they are expected to play a key role in the
ongoing efforts to entirely map and sequence the human
genome.

For breeders of domestic animal species, the
availability of large numbers of genetic markers means,
besides new approaches for individual identification and
paternity diagnosis, the possibility to map and study
genes determining production traits, and to use this
information in marker assisted selection and
velogenetics (91, 96, 141, 150).

Particularly challenging is the fact that the
majority of production traits are complex,
multifactorial traits. Animal breeders, however, have
the advantage that phenotypic information has been
carefully recorded for thousands of animals over the
years for use in classical biometrical breeding
programs, and that they can, if necessary, design and
generate the ideal family material required for such
mapping studies.

In both the human and animal fields, polymorphic
markers characterized by the highest possible
heterozygosities or "Polymorphism Information Content"
(PIC) (76) are paramount. Hence, the focus has changed
from the original diallelic Restriction Fragment Length
Polymorphisms (RFLPs) to more informative systems based
on the study of sequences such as minisatellites (18,
20), and more recently microsatellites (32-34) and the
polydeoxyadenylate tract of SINE-repetitive elements
(37). Minisatellite sequences in particular have proven
very powerful. They seem to suffer, however, from a
non-random genomic distribution, especially in the
human where, in addition, they show preterminal
confinement (35). Microsatellites, although very
abundant and highly polymorphic, require prior
sequencing efforts to generate the primers needed for
their in vitro amplification. Moreover, the large-scale
use of microsatellites requires the development of more
efficient multiplex amplification and data collection
schemes.

An alternative strategy for the generation of
highly informative marker systems is to combine several
closely spaced diallelic RFLPs into more informative
polyallelic multisite haplotypes (98). We have explored
the use of random bovine cosmid clones in Southern blot
hybridizations in order to identify such sets of closely
spaced DNA Sequence Polymorphisms. Because of the
population structure imposed by breeding strategies,
effective population sizes of domestic species are
expected to be reduced compared to humans. It was
therefore interesting to check to what extent this would
decrease the observed level of genetic variation, and to
what extent the expected concomitant increase in linkage
disequilibrium would affect the efficiency of the chosen
approach in domestic animal populations.
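The gain in informativeness from combining diallelic
sites into a haplotype can be illustrated with a short
calculation. The sketch below is illustrative only: it
uses the commonly quoted PIC formula, assumes linkage
equilibrium between sites (which, as noted above, may not
hold in cattle), and the function names and allele
frequencies are hypothetical.

```python
from itertools import combinations, product


def heterozygosity(freqs):
    """Expected heterozygosity: 1 - sum(p_i^2)."""
    return 1.0 - sum(p * p for p in freqs)


def pic(freqs):
    """PIC = 1 - sum(p_i^2) - sum_{i<j} 2 * p_i^2 * p_j^2."""
    cross = sum(2.0 * (p * p) * (q * q) for p, q in combinations(freqs, 2))
    return heterozygosity(freqs) - cross


def haplotype_freqs(site_allele_freqs):
    """Haplotype frequencies for diallelic sites.

    ASSUMPTION: sites are in linkage equilibrium, so each haplotype
    frequency is the product of its allele frequencies.
    """
    haplotypes = []
    for alleles in product(*[(p, 1.0 - p) for p in site_allele_freqs]):
        freq = 1.0
        for allele_freq in alleles:
            freq *= allele_freq
        haplotypes.append(freq)
    return haplotypes


# Hypothetical example: one diallelic RFLP versus two combined RFLPs,
# each with allele frequencies 0.5 / 0.5.
single_site = [0.5, 0.5]
two_sites = haplotype_freqs([0.5, 0.5])

print(heterozygosity(single_site), pic(single_site))
print(heterozygosity(two_sites), pic(two_sites))
```

Under these assumptions the two-site haplotype is markedly
more informative than either site alone; strong linkage
disequilibrium would reduce that gain.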

MATERIALS AND METHODS 
1. Preparation of cosmid clones:

Bovine genomic DNA was prepared using standard
procedures, partially digested with MboI, and size
fractionated by rate zonal centrifugation in a 10%-40%
sucrose gradient. Size fractions around 40 Kb were
ligated into the XhoI site of the cosmid vector pWEC
(pWE15 vector (Stratagene) with pUC18 polylinker -
Erica Cumlin, personal communication), after partial
fill-in of the insert and vector sticky ends with
respectively dATP, dGTP and dCTP, dTTP. The obtained
constructs were packaged into Gigapack II Gold extracts
(Stratagene) and used to infect E. coli 490A hosts (gift
from R. White, University of Utah, Salt Lake City, Utah,
USA). 110 colonies were selected at random, cosmid DNA
was prepared using standard procedures, and purified by
CsCl/Ethidium Bromide isopycnic centrifugation.



2. Southern blot hybridization: 

Genomic DNA from 9 unrelated Holstein individuals
was prepared from venous blood using standard procedures
and digested with 5 U/µg of the following enzymes in the
presence of 4 mM spermidine: BamHI, BglI, BglII, EcoRI,
EcoRV, HindIII, KpnI, MspI, PstI, PvuII, TaqI and XbaI.
4 µg DNA per individual was separated according to size
by agarose gel electrophoresis and blotted onto Pall
Biodyne B membranes using 0.4 M NaOH as transfer buffer.
Membranes were prehybridized at 65°C for 4 hours in 10%
PEG, 7% SDS, 50 mM NaHPO4 (pH 7.2) in the presence of
350 µg/ml bovine genomic DNA. Cosmid DNA was labelled by
random-priming (111) to specific activities of .5x10^9
cpm/µg, prehybridized with bovine genomic DNA (5 mg/ml)
for 90 min at 68°C (112), and added to the
prehybridized membranes for 16 hours. Final washes were
in 0.1x SSC, 0.1% SDS at 65°C. Autoradiography was
carried out for 2 to 6 days at -80°C with Kodak XAR-5
film and intensifying screens (DuPont Cronex Lightning-
Plus). Membranes were stripped by boiling in 0.1% SDS
and reused at least 10 successive times.

3. Calculation of nucleotide diversities:

Nucleotide diversities, π, corresponding to the
average heterozygosity per nucleotide site, were
estimated following Ewens et al. (130), using:

\[ \pi = \frac{\sum_i k_i}{2 \sum_i r_i m_i \ln n_i} \]

where n_i stands for the number of chromosomes studied
with the i-th enzyme, r_i for the number of bp of the
recognition sequence of the i-th enzyme, and m_i for the
number of cleavage sites explored with that enzyme, of
which k_i are polymorphic.

The number of explored restriction sites was
estimated from the number of fragments f observed by
Southern blotting, using m = (3f+1)/2 (119).
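The calculation can be sketched as follows. This is a
minimal illustration, not the authors' program: it
assumes the estimator is the total number of polymorphic
sites divided by twice the sum of r_i * m_i * ln(n_i), as
the definitions above suggest, and the input figures are
invented for the example.

```python
import math


def explored_sites(fragments):
    """Number of restriction sites explored, m = (3f + 1) / 2."""
    return (3 * fragments + 1) / 2


def nucleotide_diversity(per_enzyme):
    """Estimate pi from per-enzyme tuples (k, r, m, n).

    k: polymorphic sites, r: recognition sequence length in bp,
    m: sites explored, n: chromosomes sampled.
    ASSUMPTION: pi = sum(k) / (2 * sum(r * m * ln n)).
    """
    numerator = sum(k for k, _, _, _ in per_enzyme)
    denominator = 2 * sum(r * m * math.log(n) for _, r, m, n in per_enzyme)
    return numerator / denominator


# Hypothetical data: one six-cutter and one four-cutter, each scored on
# 9 diploid animals (18 chromosomes).
data = [
    (3, 6, explored_sites(400), 18),
    (5, 4, explored_sites(150), 18),
]
print(nucleotide_diversity(data))
```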



RESULTS

110 randomly selected bovine cosmid clones were
used in Southern blot hybridization experiments as
described in Materials and Methods. 96 of them, or
87.2%, gave usable patterns and were kept for further
analysis. Combining data from the 12 restriction
enzymes used, a mean of 53.87 fragments per cosmid



WO 92/13102 



PCT/US92/00340 






qualified as unambiguously readable. The only RFLPs
considered in this paper are the ones affecting these
selected fragments.

Assuming a cosmid insert size of 40 Kb and
estimating the mean restriction fragment length in Kb
(L) for a restriction enzyme according to Bishop et al.
(128), the expected number of restriction fragments
detected in Southern blot hybridization per cosmid probe
for a given enzyme can be approximated by:
integer(40/L) + 1. For our 12 enzymes, we therefore
expect a total of 173 fragments per cosmid probe.
The 53.87 fragments actually observed per clone thus
represent only 31%, or less than 1/3, of what is
theoretically possible. The remaining 69% are missed
either because they were considered difficult to read,
or more often because they went undetected due to their
abundance in highly repetitive elements blocked by the
competitor DNA, or due to their size, too small for
efficient detection in our conditions of Southern blot
hybridization. The smallest fragments qualifying as
readable in this study were in the 1 Kb size range.
The latter factor is particularly apparent with the two
four-cutters used, MspI and TaqI, whose expected mean
fragment lengths are the lowest (1747 bp and 1179 bp
respectively) despite the presence of the rare CpG
dinucleotide. Only about 15% of the expected number of
fragments are detected for these two enzymes.
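The approximation integer(40/L) + 1 and the detected
fraction can be checked with a few lines of arithmetic.
The sketch below uses only figures quoted in the text;
the function name is hypothetical, and mean fragment
lengths for the other ten enzymes are not given here, so
the total of 173 is taken as stated.

```python
def expected_fragments(mean_fragment_kb, insert_kb=40.0):
    """Expected fragments per cosmid for one enzyme: integer(insert/L) + 1."""
    return int(insert_kb / mean_fragment_kb) + 1


# Mean fragment lengths quoted in the text for the two four-cutters.
msp1_expected = expected_fragments(1.747)
taq1_expected = expected_fragments(1.179)

# Fraction of the theoretical total actually scored as readable.
detected_fraction = 53.87 / 173

print(msp1_expected, taq1_expected, round(100 * detected_fraction))
```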

Nevertheless, as many as 82 of these 96 cosmids, or
85%, showed at least one polymorphism within our
sample of 9 randomly selected individuals. The detected
polymorphic events are classified into two groups: 1)
Point Mutations ("PM"), whenever a defined polymorphic
pattern is only seen with a single enzyme, and 2)
Insertions-Deletions ("ID"), whenever such a pattern is
seen with two or more enzymes. Following these rules,
we identified 215 polymorphic events, or a mean of 2.6
independent RFLPs per cosmid probe. 162 of these
(75.3%) were considered of the PM type, the remaining 53
(24.7%) of the ID type. Table 5 summarizes these
results.
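The classification rule can be written down as a small
function. This is only a sketch of the rule as stated;
the function name, the threshold parameter and the
example patterns are hypothetical.

```python
def classify_polymorphism(enzymes, min_enzymes_for_id=2):
    """Return "PM" or "ID" from the enzymes revealing the same pattern.

    A pattern seen with a single enzyme is scored as a point mutation;
    one seen with `min_enzymes_for_id` or more enzymes is scored as an
    insertion-deletion.
    """
    distinct = set(enzymes)
    if not distinct:
        raise ValueError("a polymorphic pattern needs at least one enzyme")
    return "ID" if len(distinct) >= min_enzymes_for_id else "PM"


# Hypothetical patterns for one cosmid probe.
events = {
    "pattern_1": ["TaqI"],
    "pattern_2": ["EcoRI", "EcoRV", "HindIII"],
    "pattern_3": ["BamHI", "BglII"],
}
calls = {name: classify_polymorphism(enz) for name, enz in events.items()}
print(calls)

# Proportions reported in the text: 162 PM and 53 ID out of 215 events.
print(round(100 * 162 / 215, 1), round(100 * 53 / 215, 1))
```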
TABLE 5

Bovine Multisite Haplotypes



Hft!tE »" HI EC0RI " 

HSBT001 ™ { ™ 

?/3 



HSBT005 2PK(5) 
6/7 
3Plit5i 
M/7.8 



1IDU7) 

4,2+2.9/3-9+3,2 



1IDC17J 

9.5+1.5/9.1+1.6 



HSBT0S7 1IDC17) 
3/H 

HSBT0O9 



KSETODfl 



1PK(28) . 
?/5.B 



SEBT011 3IB139) 
7.8/5 



H3BTC13 



IPX (22) 
7.7/E 
2Plt(39> 
5.5/4 



HSBT015 imW 
14/13 



1ID122) 
9.5/10 
2PH<6) 
?/3.7 



2PK(23) 
25/22 



1IDC44) 
8.5/6.5 



KSBTOU 



KSBT017 
table 5 (Continued)
mi. Hmdlll Kpnl Kspl PstI PvuII 



MSBT019 ??(??> tmm 

20/17+2.B 

2PK(47) 

30+5.5/36 



4.3/5.5 



KSBTOOi 2ID155) 
9/12.5 


110(11) 
?/B.B 
2ID155) 
12/12.5 




??(??) 


HSBT0C5 


1ID(17) 


1IDU7) 


1IDU7) 
5.8+5/5.4+4.8 
4PK(50S 
4/7.B 


M5BT007 110(17) 
12/6.B 


1IDU7) 
8.2/9.5 






HSBT009 


1PN(44> 
3.9/5.7 




2PM (46J 
4.5+5.1/9.5 


MSBTOOA 


2P!?(2B) 

5/6.5 

3PHU1) 

3.8/2. B 

4PH<5) 

11/9.3+l.B 







HSBT011 4PKC22) 
16/14 



HSBT013 HB(22) ?»(??) 

19/16 




table 5 (Continued) 



HAKE Hindlll 


Kpnl 


Hspl PstI 


PvuII 


R5BT0I5 




??(??) 




K5BT016 




1IDC40) 
5.B+4.3/6.5 




H5BT0I7 


KSBT0I9 


3PKC50) 
11/6.4+4.3 


?? (??) 





H5BT020 2PM (22) 
15/10 






NAKE TaqI Ibal 


MET. 


KSBT001 ??(??) 


55 


HSBT001 


0 


HSBT005 1ID(I7) 


54 


4.2+2.2/3.9+2 




HSBT005 3PK(5) 


C 


14/6 




MSBT007 


17 


M3BT009 


77 


KSBTOOA 


50 


SSBTOCfl 


0 


KSBTOOA 


0 


HSBT011 3ID139) 


67 


5.3+2/ 




KSBT011 


0 


4.4+4.6+2.9 





table 5 (Continued) 



HSBT013 1IDC221 
5.5/6 

KSBT013 




33 
0 


HSBT015 1ID(44) 

12/5 
HS3T015 3PHC33) 

8.3/10.B 




66 
0 


HSBT016 1ID(40) 
6/3.3+2.5 




50 


(BBT017 




0 


JBBT019 4PNC44) 




89 


HSBT019 5PM (38) 




0 


RSBT020 


3PKC35* 
30/25 


66 






NAME BatHI 



KSBT022 1PHC47) 
3-3/3.5 



Bgll 



table 5 (Continued) 

Bgill EcoRI EcoRV 



mm 

5.B/2.5+3.2 
3PK(5) 
5.8/5.2 
4PM (B) 
V2A 



RSBT023 



KSBT024 



1ID126) 
14/7.5 



1ID(261 
10/10-7 



HSBT025 



lPHtil) 
9.6/7.5+2.2 



2PIU33) 
12/9-7 



K5BT026 






l??(??5 
8/? 


1??(??) 


HSBT02B 


1HK371 
5/6.3 
2ID(??J 
3/? 


113(37) 
14/10 
2ID(??> 
15/? 


1IDI37) 
13/6.6 


1IDC37) 
8.1/5.5 
2ID(??> 
3.4+5/? 


ltSBTC30 1PK(6?J 










8/5.6+2 










RSBT031 








??(??* 


HSBT032 1PK(22) 
9.6+6/16 






2IDtU) 
?/6.2+7+l2 


2ID(1U 
?/6.1 


H5BT033 






lPfldll 

3.5/5 

2ID(44) 

5.8/7 

3PM (44 J 

9/12 


2ID(44) 

6/7.8 

??(??) 



IBBT034 






table 5 (Continued) 



MAKE Hindlll 


Kpnl 


Hspl 


PstI 


PvuII 


HSBT022 


HSBT023 








1PK(45) 
5*5+3.7/3.2+2 


HSBT024 JIB i2b) 
?/7.B 




1IP(26) 
7.5/5 




1ID(2A> 
13/14 


W5BT025 ??(??> 


H5BT026 ? 




2??t??) 






HSBT028 


HBBT030 








3??(??1 
3,3/3.1 


HSBT031 ??(??) 




??(??) 




??(??) 


KSBT032 2IDC111 
?/12 
3PH(33> 
4.6/4.9 
4ID(56) 
4/3.4 








2ID(ii) 
?/2.7 


KS6T033 2ID(44) 
14/15 

??C??J 


4X0(66) 
A.5/2.5 


4IDC&6) 
1.7/4.3 




??(??) 


HBBT034 


1ID(44) 
5.9/3.9 


??(??) 




1ID<44) 
4.7/4.3 
2PHC22) 
5.3/4 






table 5 (Continued) 



NAME TaqI 


Xbal 


HE7. 


MSBT022 5PH(44) 

?/3.2 
HSBT022 ??(??) 




89 
0 


HBBT022 




0 


HSBT023 


2P1U33J 
17/13 


45 


N5BT024 1ID(26> 

14/15 
HSBT024 ??(??) 


110(26) 
3.B/4.7 


26 
0 


HSBT025 3PH(37) 
9/11/iC 
HSBT025 4PH(??J 




78 
0 


HSBTC25 




0 


HSBT02B 




37 


KSBTC2E 




0 


HSBT030 2PKC11) 
5/5-2 




7B 



WSBTC3: WW) 



R5BTC32 4ID(56) 
4.1/4.3 

H5BT032 
HSBT032 



HSBTC33 5PH<44> 2ID<44> 
4/3.4 5/6.9 

HSBT033 

??(??) 

HSBT033 



BB 
0 
0 



KSBT034 II0CM) 

5.B/B 
RSBT034 3??(??) 



56 
0 




table 5 (continued) 
NAME BaiHI Bgll Bglll ' EcoRI 



HSBT035 HB«55) 

6.3/6.6 



HSBT037 




3??Ui) 
2??(11) 






HSBT039 5PNI??) 


ilD(ll) 
6.6/10.5 


HD(ll) 
15.0/14.0 


110(11) 
7.5/6.4 


110(11) 
7.5/6.4 


HSBT039 lPH(li) 
9/9.6 


HSBT040 




1PE(22) 
16/16 






HSBT041 


HSBT042 1PK(331 
14/15 
2PK(33) 
6/2 


RSBT043 


09(9? J 


110(44) 
4.5/4.7 
210(55) 
3.7/4.3 


110(44) 
3.2/3.6 




WSBT044 


HSBT045 




110(44) 
4/3.4 


210(69) 
3.9/6.5/3.5 


210(69) 
4/4.8/3.5 


K5BT046 






110(78) 
3.5/3.6 


110(76) 
3.5/3.6 



HSBTM7 1PHC11) 
4.3/7.3 
2PM30) 
?M.5 






EcoRV 






m£ Hindlll 



BSBT035 



HSBT037 



Kpnl 



table 5 (Continued) 
Kspl PstI 



2IDt??J 



1P«(22) 
4.5/5 
4??(??) 
3.8/? 



PvuII 



2ID(??J 



HSBT03B HD(li) 
9.6/8.5 



HSBT039 



HDtll) 
6.0/5.5 



2PH(37> 
19/2.7+16 

4W (90) 

4.3/? 



HD(il) 
6+8/2 



KSBT04G 



2??(??J 



KSPT041 



K5BT042 



H5BT044 0 



HSBT045 11B(44) 
11/10.5 



BSBT046 



3PHC44) 
7/? 



??{??) 



1ID(44) 



H5BT043 1HH441 
6.5/6.9 
3PH(7B) 
4.8/3.3 


5PHC33) 
2.9/3.1 
61D(33) 
3.7/3.9 


6IBC335 
?/2+2.i 
7PK(li> 
?/2.4 


4PKC7B) 






4.8/4.6 






9PH(??> 






3/2 







1ID(??) 
3.85/3.9/3.95. 



HSBT047 



3PK(11> 
6.5+2.8/9.3 

mm 

19/14 






table 5 (Continued) 



NAME TaqI 


Xbal 


HET. 


HSBT035 115(55) 
7.B/5.3 




55 


MSBT037 




22 


HSBT037 




0 


HSBT03B ilD(ll) 
5.5/? 

H5BT03B mm) 
3.5/? 


HD(li) 
6.6/5.6 
fi??(??) 


44 

0 


KSBT039 




11 


KSBT04C 




22 


KSBT04I 




0 


KBBT042 ??(??) 


??(??) 


44 


RSBT042 




0 


H5BT043 m(li) 
3.3/2.9 

HSBT043 2IDC55) 
1.4/2.2 

HSBT043 




99 
0 
0 


H5BT043 




0 


K5BT044 


/ 


0 


HSBT045 3PKI33) 
U/12 




69 



H5BT046 2WH33) 78 
4,9/5.2 

HSBT046 1IIK7B) 0 
2.9/1.7 



HSBT047 5P*(??> 
14/21 

KSBT047 



44 

0 






table 5 (Continued) 



«AHE BaiHI 
KSBT048 



Bgll 



BglH 



EcdRI 



EcoRV 



BSBT049 



KSBT050 



iID(22) 
9/3.9+5.3 

iP!U??> 
?/8.2 



11DC22) 
9.7/4 



BSBT051 2PK(22) 
20/16 



IPR(ll) 
7/14 




H3BT056 



KSBT057 



HSBT059 



IID(55) 



110(551 
5.6/7 
2PH122) 
12/5.5 



llD(U) 
?/B 



1IDC33) ' 
l.B/3.4 



1ID1551 
5.7/7.3 



ilD(ll) 
?/B.5 
2PM??> 
7/6-7 



1IB(33> 
?/3.2 



HSBT060 



HD(ll) 
6/5.B 



1IDC11) 
5/6.5 



IID(U) 
9.6/11 
2ID155) 
6.4/9.6 



llDtll) 

9-6/11.5 

2IDC55) 

3.5/6.6 

3PHC22) 

9.6/B.B 



KSBT061 






table 5 (Continued) 



NAME Hindlll 


Kpnl 


Mspl 


PstI 


PvuII 


iMfiwA * #* « lift 

HSBT048 110(44) 
6.2/5.9+3 


4 1l\ / 111 

1ID(44) 
27/15+7,8 


i tiwjai 
11C144J 

4.3/3.7+3.2 

2PHC11) 

5.5/4.6 




1 infill 

6.5/9 


HSBT049 2PKU1) 
8.4/7 










H5BT050 2PH(??» 
?/1.5 


2ID(33) 




jrnill) 
3.2/2.B 


Z1LH00J 

2+1.6/6.8+l.B 


HSBT051 




3Pn(??l 






M5BT052 




1IDI??) 




HD(ll) 
12/10.5 


HSBT053 






2ID(44) 
?/6.8 


2IDC44) 
2.1/2.4 
3PK(??> 
11.5/11 


KSBT054 






1P«(22) 
2.6/2.9 




MP TIT AC Z 








1ID(55) 
3/2.7 


HSBT057 










KSBT059 




1ID(33) 
12/11 






HSBT060 HD(ll) 
6.2/6.8 






4PM(??) 
2.7/1.4 





HSBTOfci 1IDI44) 
7*2.4/9.3 



110(44) 
4.7/5.2 






table 5 (Continued) 



MAKE TaqI 


Xbal 


HET. 


KSBT04B 
HSBT048 


1ID(44) 
6.4/7 


55 
0 


KSBT049 




22 


HSST050 2ID(33) 

2.5+2.7/2.4+2.9 




re 
DO 


nSBTOSl 4PHC22) 
5-2/6 

USBT051 5PH(33> 
3.3/3.1 




78 
0 


H3BT052 HD(ll) 
B/7.B 




11 


HSBT053 ?? 




44 


HSBT053 




0 


HSBT054 2PHU1) 
4.6/7.4 


3PHC33) 
3.7/4.8 


33 


KSB7056 1ID155) 




66 



2.5/4.7 

KSBT056 



R5BT057 3PKC1D 4PMW 

14/9.5 20/15.5+4.5 

HSBT057 



HSBT059 2PH(66) 
7.6/6.6 
RSBT059 3PH(??) 



KSBT060 
RSBT060 
ISBT060 



HSBT061 



44 






t able 5 (Continued) 



NAME BatHI 


Bgll 


Bgll I 


EcoRI 


EcoRV 


MSBT062 








??(??) 


MSBT064 


MSBT0&5 




1ID(44) 
4.5/7 






HSBT067 








11BC55) 
9/7.3 


HSBT068 HD(li) 
18/11 






HD(ll) 
14/12 


HD(ll) 
10/B.8 


K5BT069 2PHC33) 

16+3.5/19.5 




1PKU4) 

12.5/10.5 


3PMUU 
?/9 


4PM (22) 
9/B.4 . . 


HSBT070 


lPH(ii) 

?/3 




210(55) 
2.5/2,9 


2IDC55) 
2.7/3.2 


MSB7071 


HSBT072 




1ID(33) 
13/12 






HSBT074 


WSBT075 


H5BT076 210(44) 
6.2/4.B 




1I0C33) 
2.5/2 


2ID(44) 
2/2.8 


2ID(44) 
2.3/2.9 


MSBT078 






lPH(ll) 
7.6/6.7 


2PMC11) 
7.8/6.7 



HSBT079 ??(??) 



MSBT080 HD(ll) 

4.5+2/5.7+4.3 



RSBT0B1 



??(??> 
?/ll 






mi Hindlll 



HSBT0A2 



HSBT064 



HSBT0&5 



WSBT067 

HSBTO&e 
KSBT069 



HSBT07 



HSBT072 2PMC11) 
9.5/B.2 

KBT074 mm) 

4.6/5.3 



Kpn 



t able 5 (Continued) 
[ Mspl PstI 



1PM (33) 
25/19 



1ID(U) 
5.2M 



1ID155) 
3.9/3.5 



KSBT070 3PRC22) WIU" 
9.4/7.9+1-5 13/10.7+2 



1IDE33J 
3.7/4 

2PIU44) 
10.B/12/9.5 
3PM 133) 
17/16.5 
4PM44) 
10.B/12/9.5 



3PM(??) 
?/5.4 



PVElII 



5PH(33J 
?/3.2 



5P«(44) 
8.7/9.5 



1P«(??J 
?/B 



4PH(44) 
5.9/9 

5PK(33; 
3.3/5.7 



M5BT075 

M5BT076 

HSBT07B 



KSBT079 



— 1IDC11) IBllU 

HSBT080 UW2.B 



HSBT081 



2PH(33) 
2.2/2.7 






table 5 (Continued) 



NAKE TaqI Xba! HET. 



HSBT062 ??(??) 




0 


nSBT064 




33 


HSBT065 110(44) 
6.7/4-7 


110(44) 
4.8/3.1 


44 


HSBT067 


mm 


55 


H5BT06B 2PKI33) 
3/1.6 




44 


HSBT069 




B9 


HSBT07C 6Pf!(??) 
?/3.7 




7B 


H5BT071 2PKU1) 
15/11 




11 


HSBT072 


5PMH) 
20/12 


78 


HSBT074 




66 


H5BT074 




0 


KSBT074 




0 


HSBT075 1PHU1) 
3.2/4.4 




11 


KSBT076 1ID(33) 
2/3.8 




44 


HSBT078 


4PIH33) 
4/6.4 


44 


H5BT079 
HSBT079 


iPIUll) 
11/13 
2PH133) 
3.5/3.7 


44 

0 


ttSBTOBO 


2PKU1) 
6.3/3.9+2.5 


22 



K5BT0B1 



33 






table 5 (Continued) 



MAKE BaiKX 



HSBT0B3 2PK(44) 
ll/i? 



H5BT0B4 IPK(44> 
4.5/9 



Bgll 



BglH 



EcoRI 



1PKCU1 
?/4.5 



K5BTOB5 



KSBT0B6 2ID122) 

10.5+2.2/12.5 



1ID(44) 
7.7/6.5 
2ID(22) 
3.3/4.B 



3IDC33) 
5/3.8 



lPM(ll) 
6.2/6+9.4 



EcoRV 



3IDC33) 
5.5/3.9 



3ID(55) 
5.5/4.8 
2ID(22) 
6/4.5 

3PHU1) 
?/3.7 



HSBT0B7 



HBBTOBE 



HSBT0B9 



RSBT090 
HSBT091 
KSBT092 



H5BT096 



99(9?) 

5.5/6 




KSBT097 



HSBT098 








TABLE 


5 (Continued) 




NAME HindHI 


Kpnl 


ttspl 


PstI 


PvuII 




KSBT083 miW 
VI 










HSBT0B4 






2PKC221 
5.1/3.9 




HSBT085 2IDCZ2) 
12/5.7 


HSBT086 




2IDC22J 
IB/21 
410(55) 
4.3/7 


4ID155) 
2.5/5 


5PHC55) 
7.1/5 


H5BT007 


R5BT08B 


HSBT089 


«SBT09C 


HSBT091 






3ID(33) 
?/4 




HSBT092 mW) 
4,3/2.8 




2WU22) 
3.5/4 






HSBT093 






1PHI33) 
4.7/1.9*2.5 
2PHC33) 
?/3.3 




RSBT094 


KSBT096 


lPH(il) 
8/4.B+3.2 









HSBT097 1PK(??) 



HSST098 



IPM66) 2PH(??1 






table 5 (Continued) 



NAHE TaqI 



Xbal 



het. 



KSST0B3 5PKU1) 
?/4 



R53T0B4 3PRC33) 
2.3/2-4 

HSBT0B4 4PH(??J 
?/2.B 



HSBT0B5 3ID(55> 

5/3.B 
HS6T085 1ID(44) 

mj 

KSBT086 4IDC55) 
6/3.3 

H5BT0B6 



2ID122) 
11+15/23 



66 



55 
0 



55 
0 

7B 
0 



HSBT0B7 



BSBTOBB 



I^BT0B9 ° 


HSBT090 1PHU1) 
1.8/3-5 




11 


H3BT091 3IDC33) 
?/5.B 




33 


KSBT092 3PIU33) 
5.3/6.B 




66 


«5BT0?3 




44 


OT093 




0 


HSBT094 




0 


IBBT096 




11 


HSBT097 




0 

1 



HSBT098 3PR155) & ml) 
4.5/4 tB'W 






table 5 (Continued) 



NAME BatHI 


Bgll 


figlll 


EcoRI 


EcoRV 


.1SBT09B 


HS8T099 1PH(33) 
14/15 






2:d(id 

10.5+4/15+3.9 


2IBC11) 
B+4/B.B+3.9 


H5BT100 


MSBT101 2PJK22J 
B/10.2 


1PM44) 
9+7/16 








HSBT103 


HSBT1M 


KSBT1C5 






1PM??/ 




KSBT106 1ID122) 
?/9.5 






IIDC22) 
?/8.3 


110(22) 


HSBT107 2PM(67) 

12+2.4/14.4 


1IB(33) 
2.6+2,8/3.1 








HSBT108 2PKtll) 
9.5/7.B 




iPH(ii) 

15/6.5 






HSBT109 


HS3T110 








1PK<44> 
10.3/9.1 


HSBTUi 






lPff(ll) 
?/3 




BSBT113 






1PM22) 
10/9.5 




HSBT114 HD(33) 
?/4.4 






110(331 
8.8/8+1.9 


110(33) 
5.5/4.8 






HSBT103 



IfSBTlOA 



HSBTiOE 



table 5 (Continued) 



MflHE Hindi! I 


Kpnl 


Hspl 


PstI 


PvuII 


KSBT09B 


HSBT099 


2IB111) 
19/21 

mm) 

10+9/19 


2IDC11) 
2.1/1.9 


2IDU1) 
4.9/4.8 
4PH(??) 
?/4.2 




RSBT100 


lPM(il) 
8/U.5 








KSBTI01 




??(??) 




3PHC55) 
3.55/3.6 



RSBT1C5 



K5BT106 1IM22) 
?/3.7 

KSBT107 



2PKI??) 



3??(??) 



KSBT109 1ID155) 

4.9+5.8/10.6 

HSBT110 



1ID155) 
4.6/3.4 



nsmn 



IBBT113 OT(33) 3IDI111 

7.5/3.4*4.2 «W 

3IDU1) «K5» 

4/7*3.8 12+21/35 

HSBT114 IIDC33) 
?/2.5 



1ID133) 
11/10 



2ID122) 
2.9/3.2/2.B 



3ID111) 
7.5/9.2 



1ID(33) 1»<33» 
5.3/5.9+1,6 ?/4 






table 5 (Continued) 
NAKE TaqI Xbal HET. 

MSBT09B 4PHU1) 0 
2.5/2.B 

HSBT099 2ID(iU 78 
4.9/4.B 

HSBT099 5PMC22J 0 
2.5/1.5 



HSBT100 




11 


HSBT101 
KSBTiOi 


4PKC33) 
18/5/7.3 
5PK(33) 
18/5/7.3 


66 

0 


KSBT103 0 


MSBT104 




0 


HSBT105 




0 


HSBT106 




22 


HSET107 1ID(33) 
2.1/2.6 




7B 


KSBT10B 


3PHC50) 
7/11.5 


55 


HSBT109 


2PM(??) 


55 


WSBTllO 


2P«t??> 


. 44 


HSBT111 3IDC11) 
7/8.8 

NSBTHl 4PHU1) 
3.7/4.9 




22 
0 


M5BT113 3IDC11) 
?/7 

HSBT113 4IDC55) 
6.5/6.9 




66 
66 


HSBT114 


1IDC33) 
?/4.9 


55 
table 5 (Continued)



HAKE BaiHI 


Bgll 


BglH 


EcoRI 


EcoRV 


HSBT114 






2PMC22) 
?/3,5 




K5BT116 



MBBT120 



1PHU1) 
?/7 



HSBT121 



1FHC33) 2P!1(33) 
5+5.B/10.B 5/5.4 
NAME HindIII



Kpnl 



Hspl 



PstI 



PvuII 



SSBTH4 3PKC22) 
6.7/10.6 



4PM (55) 
5/4.?;?. 2 
5PM (55) 
5/4.7/7.2 



MSBT116 



??(??) 



1PM (33) 
6.5/5.5 



H5BT119 1ID(55) 
11/5.2 



HSBT120 2PHI44) 
7.5/6.1 



MBBT121 
table 5 (Continued)



NAME TaqI 


Xbal 


HET. 


rtSBTlH 




U 


H5BT114 




0 


H5BT1I6 




33 


RSBT119 iID(55) 
4.2/2.B 

HSBTI19 2PH(44) 
6/7.7 




67 
0 


HS8T120 3PH(W 
5.2/3,7 




55 


K5BT121 


3PHC44) 
3.4/5.2 


67 
Although this way of classifying RFLPs gives a
conservative estimate of the number of identified
polymorphisms, for cosmids characterized by strong
linkage disequilibrium, the number of ID events may be
inflated at the expense of the actual number of PMs. To
compensate for this, we performed a second set of
calculations in which a polymorphic event must be
detected with at least 3 enzymes to qualify as ID.
RFLPs previously attributed to insertion-deletion events
because they were detected with two enzymes are now
considered as two independent point mutations. 24
polymorphisms initially considered IDs fell into this
category. Following this approach, 239 independent RFLPs
were identified, or 2.9 per cosmid, with now 87.9% of
the PM type and 12.1% of the ID type.
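The effect of the stricter criterion on the counts can be
retraced from the figures given in the text. The helper
below is a hypothetical bookkeeping sketch: each ID seen
with exactly two enzymes is converted into two PMs.

```python
def reclassify(pm_count, id_count, ids_seen_with_two_enzymes):
    """Apply the three-enzyme rule to counts obtained with the two-enzyme rule."""
    new_pm = pm_count + 2 * ids_seen_with_two_enzymes
    new_id = id_count - ids_seen_with_two_enzymes
    return new_pm, new_id


# Figures from the text: 162 PM, 53 ID, of which 24 were seen with two enzymes;
# 82 cosmids showed at least one polymorphism.
pm2, id2 = reclassify(162, 53, 24)
total = pm2 + id2

print(pm2, id2, total)
print(round(100 * pm2 / total, 1), round(100 * id2 / total, 1))
print(round(total / 82, 1))
```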

Table 5 reports the observed heterozygosities
obtained with the generated multisite haplotype systems.
These values correspond to the percentage of individuals
heterozygous for at least one of the polymorphisms
identified with a given cosmid. Notably, this
parameter is not affected by the mode of classification
of RFLPs in PMs or IDs. At this point and without
segregation information, we cannot dissect the
heterozygous genotypes into their component haplotypes.
These heterozygosities were estimated on a small sample
and should therefore be considered cautiously. As
pointed out by Skolnick and White (129), the main
advantage of working with a sample of 9 individuals is
that it is sufficient to identify the majority of useful
polymorphisms. However, the mean heterozygosity of
51.9%, obtained over the 84 polymorphic cosmids,
demonstrates the power of the approach.
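The heterozygosity measure defined above can be sketched
in code. The data layout and the genotypes below are
invented for illustration; only the definition (an animal
counts as heterozygous if it is heterozygous for at least
one polymorphism of the cosmid) comes from the text.

```python
def observed_heterozygosity(animals):
    """Percentage of animals heterozygous for at least one polymorphism.

    `animals` is a list with one dict per animal, mapping a polymorphism
    name to its two-allele genotype.
    """
    het = sum(
        1 for genotypes in animals
        if any(a != b for a, b in genotypes.values())
    )
    return 100.0 * het / len(animals)


# Hypothetical cosmid with two RFLPs scored in 9 animals.
sample = (
    [{"PM1": ("A", "B"), "ID1": ("A", "A")}] * 3    # heterozygous at PM1 only
    + [{"PM1": ("A", "A"), "ID1": ("A", "B")}] * 1  # heterozygous at ID1 only
    + [{"PM1": ("A", "A"), "ID1": ("A", "A")}] * 5  # homozygous at both
)
print(round(observed_heterozygosity(sample)))
```

With 9 animals the measure can only take values that are
multiples of 1/9, which is consistent with the steps seen
in the HET. column of Table 5.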

The following numbers of RFLPs were detected by
each enzyme, irrespective of PM or ID type: TaqI: 57,
EcoRI: 37, MspI: 33, HindIII: 31, PvuII: 30, EcoRV: 29,
BamHI: 26, BglII: 24, XbaI: 21, PstI: 18, KpnI: 16 and
BglI: 13. The numbers of polymorphisms detected with
BglII and PstI were corrected to adjust for the lower
number of probes used with these enzymes.

Proper interpretation of the polymorphic patterns
has been confirmed by segregation analysis in pedigree
material for most of the described RFLPs (see
hereafter).

RFLPs of the PM type were used to calculate
nucleotide diversities as described in Materials and
Methods. Two sets of values are reported, depending on
which of the two criteria was used to classify an RFLP
into the PM or ID type. Global nucleotide diversities
of respectively 0.000652 and 0.000846 were obtained,
meaning that a randomly selected Holstein animal will be
heterozygous at approximately 1 in every 1200 to 1500
base pairs. As expected because of the presence of a
hypermutable cytosine followed by guanine in their
recognition sequence, nucleotide diversities more than
twice as high are obtained when combining data obtained
with the enzymes MspI and TaqI: 0.001493 and 0.002239
respectively (5). On the other hand, the recognition
sequences of the enzymes BglII, HindIII, PstI, PvuII and
XbaI are devoid of hypermutable cytosines in the CpG
dinucleotide and yield combined nucleotide diversities
less than one third the values found with MspI and TaqI
(0.000492 and 0.000648). Using these two sets of
values, one can extrapolate what nucleotide diversities
would be obtained if sampling hypothetical sequences
composed entirely of hypermutable cytosines, giving
respectively 0.004496 and 0.007012. Assuming that the
majority of detected polymorphisms behave according to
the neutral mutation-random drift hypothesis, nucleotide
diversity and mutation rate are simply related as:

\[ \pi = 4 N_e \mu, \]

where Ne stands for the effective population size and µ
for the mutation rate (1). Therefore, our data allow us
to estimate that cytosines followed by guanines mutate
at a rate approximately 10 times higher compared to
other nucleotides, presumably because a substantial
fraction of these are methylated in the germline and
prone to mutate to thymidine by spontaneous deamination.
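The factor of roughly 10 follows directly from the quoted
diversities. The sketch below reads the relation as
π = 4·Ne·µ, so that for a common Ne the ratio of
diversities equals the ratio of mutation rates; the
variable names are hypothetical and the inputs are the
values stated in the text.

```python
# Nucleotide diversities quoted in the text, one value per classification rule.
pi_non_cpg = (0.000492, 0.000648)   # enzymes without CpG in their site
pi_all_cpg = (0.004496, 0.007012)   # extrapolated, all-hypermutable sequence


def mutation_rate(pi, effective_size):
    """Solve pi = 4 * Ne * mu for mu (neutral model, ASSUMED relation)."""
    return pi / (4 * effective_size)


# With the same Ne in numerator and denominator, Ne cancels out of the ratio.
ratios = [cpg / other for cpg, other in zip(pi_all_cpg, pi_non_cpg)]
print([round(r, 1) for r in ratios])

# Base pairs per heterozygous site implied by the global diversities.
print([round(1 / pi) for pi in (0.000652, 0.000846)])
```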

DISCUSSION 

We demonstrate in this work that large numbers of
DNA markers with very acceptable Polymorphism
Information Content can be quickly generated using
large, randomly selected genomic probes in Southern blot
hybridization experiments. The multisite haplotypes
identified in this study using cosmid probes have a mean
heterozygosity of 51.9%. This value is of the same order
of magnitude as the heterozygosities that we have
obtained with a panel of approximately 40 bovine
Variable Number of Tandem Repeat markers (mean
heterozygosity 59%; 150), and with more than 50 bovine
(TG)-dinucleotide microsatellites (mean heterozygosity
56%; unpublished).

A remarkably high proportion of the tested cosmids
proved informative: 74.5% of all tested clones, and as
high as 85% when considering only the clones giving
readable patterns. Compared to strategies aimed at
isolating hypermutable sequences such as mini- or
microsatellites, very little time and effort is wasted
on candidate clones which have to be dropped at a
later stage.

Because the cosmid clones used as probes are
selected at random, we can reasonably assume that the
coverage obtained with the generated markers is fairly
uniform. Monte-Carlo simulations allow us to estimate
that these 82 markers cover 29%, 47% or 60% of
the bovine genome in linkage studies if a maximum of
respectively 5, 10 and 15 cM are scanned on each side of
each marker, and assuming a total bovine map length of
25 Morgans as deduced from chiasmata counts (151).
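A coverage simulation of this kind can be sketched as
follows. This is a simplified stand-in, not the authors'
simulation: it treats the genome as a single linear map
of 2500 cM instead of separate chromosomes, places
markers uniformly at random, and uses hypothetical
function and parameter names.

```python
import random


def covered_fraction(n_markers, window_cm, genome_cm=2500.0, trials=200, seed=0):
    """Mean fraction of the map within `window_cm` of at least one marker."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        positions = sorted(rng.uniform(0.0, genome_cm) for _ in range(n_markers))
        covered = 0.0
        reach = 0.0  # right-hand end of the region already counted
        for x in positions:
            start = max(x - window_cm, reach)      # skip overlap with earlier markers
            end = min(x + window_cm, genome_cm)    # clip at the end of the map
            if end > start:
                covered += end - start
                reach = end
        total += covered / genome_cm
    return total / trials


for window in (5, 10, 15):
    print(window, round(100 * covered_fraction(82, window)))
```

Under these simplifying assumptions the simulated
coverage should fall close to the percentages quoted
above; splitting the map into chromosomes adds end
effects and would lower the values slightly.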






The Southern blot hybridization procedure used for the
detection of these RFLPs is a very mature and robust
methodology, allowing the treatment of very large
numbers of samples simultaneously, and benefitting from
intrinsic "multiplexing" properties, especially when
using nylon membranes. Indeed, and despite variations
between batches, we are routinely using membranes for 10
or more hybridization cycles.

The main disadvantage of multisite haplotypes is
the requirement to use several restriction enzymes to
fully exploit their PIC. This increases costs and the
amounts of DNA required, and complicates the
organization of genotype collection and its subsequent
use in linkage analysis.

The fact that 75 to 85% of the cosmids tested in
cattle reveal polymorphism compares favorably with
results previously reported in the human. Schumm et al.
(131), for instance, report that 30% of the 1664 lambda
clones they tested in a sample of 5 individuals gave
polymorphic patterns. Adjusting for the sample size and
a ratio of approximately 2.5 between cosmid and phage
insert size, the two figures are probably fairly
similar. Surprisingly, only 54 of 101 human cosmid
clones tested by the same group (152) revealed
RFLPs when tested with 9 restriction enzymes, versus
74.5% in our study with, however, 12 enzymes. It has
been speculated that the relatively low level of
polymorphism found in that study might result from a
bias against human methylated sequences (including CpG
present in the recognition sequence of TaqI and MspI)
when constructing the cosmid library, due to the active
modified cytosine restriction system (mcr) of the E. coli
1046 host (152).

These results are quite unexpected. Indeed,
because of the population structure imposed by breeding
strategies, the effective population size, Ne, in cattle
is expected to be significantly lower than in the human.
In American Holsteins, Ne is estimated between 10^2 and
2x10^3 (Ina Hoeschele, personal communication). This
value has to be compared with an estimated Ne of 10^4 in
the human (18). Assuming identical mutation rates, this
reduction of Ne should be accompanied by a concomitant
reduction in genetic variation. As a matter of fact, we
obtain estimates of global nucleotide diversity between
0.000652 and 0.000846, which are between 3.5 and 2.5
times lower than values typically found in human
populations (2, 119, 132). This confirms our previous
results in another cattle population: the Belgian Blue
Cattle breed (3, 4).

At least part of the discrepancy between an apparently
reduced nucleotide diversity but similar RFLP frequency may
be accounted for by the apparently higher frequency of
insertion-deletion events found in cattle compared to the
human. Schumm et al. (131) report that 58 out of the 515
polymorphic loci (11.26%) show insertion-deletion type
RFLPs; other groups report even lower frequencies of such
events in the human (R. White, personal communication). In
cattle, we found that 29 (35%) to 46 (56%) out of 82
polymorphic cosmids show such insertion-deletion events,
depending on whether an ID-type polymorphism has to be
detected with two or more enzymes. These results seem to
point towards a fundamentally different property of the two
genomes. It is tempting to speculate that this high level of
insertion-deletions in the bovine genome reflects the
activity of a mobile element. Analysis of the restriction
patterns characterizing these ID events, however, does not
reveal any typical, recurrent "signature" of such an
element.

Altogether, our laboratory has now isolated more than 200
DNA markers for cattle with a mean heterozygosity above 50%:
82 multisite haplotypes, 40 Variable Number of Tandem Repeat
markers (150) and more than 80 dinucleotide microsatellites
(unpublished). The coverage of the bovine genome obtained
with an increasing number of randomly selected probes (0 to
500) was estimated by Monte-Carlo simulation, assuming that
the family material used is sufficient to detect linkage at
5, 10 and 15 cM, respectively, and assuming a total bovine
map length of 25 Morgan as deduced from chiasmata counts
(151), divided over the 30 bovine chromosomes according to
their relative length. With the PIC characterizing our
marker set, we feel fairly comfortable that in the majority
of situations we will be able to cover genetic distances of
the order of 10 cM or more, especially if applying
multilocus or interval mapping techniques (66, 68).
Therefore our panel of probes should cover around 75% of the
bovine genome.
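
As an illustration only, a Monte-Carlo coverage estimate of
the kind described above can be sketched as follows. This is
not the simulation actually used: the function names are
hypothetical, markers are placed uniformly at random, and the
30 chromosomes are assumed to be of equal length because the
relative lengths are not given in the text.

```python
import random


def covered_length(sorted_positions, detect_cm, chrom_len_cm):
    """Length of one chromosome lying within detect_cm of any marker."""
    covered = 0.0
    reach = 0.0  # right edge of the stretch already counted
    for pos in sorted_positions:
        left = max(pos - detect_cm, reach)
        right = min(pos + detect_cm, chrom_len_cm)
        if right > left:
            covered += right - left
            reach = right
    return covered


def simulate_coverage(n_markers, detect_cm, chrom_lengths_cm,
                      n_reps=200, seed=0):
    """Mean fraction of the genome within detect_cm of at least one marker."""
    rng = random.Random(seed)
    total = sum(chrom_lengths_cm)
    indices = range(len(chrom_lengths_cm))
    mean_fraction = 0.0
    for _ in range(n_reps):
        positions = [[] for _ in chrom_lengths_cm]
        for _ in range(n_markers):
            # A marker lands on a chromosome with probability
            # proportional to that chromosome's length.
            idx = rng.choices(indices, weights=chrom_lengths_cm)[0]
            positions[idx].append(rng.uniform(0.0, chrom_lengths_cm[idx]))
        covered = sum(
            covered_length(sorted(pos), detect_cm, length)
            for pos, length in zip(positions, chrom_lengths_cm)
        )
        mean_fraction += covered / total / n_reps
    return mean_fraction


# Assumption: 30 equal chromosomes sharing 25 Morgan (2500 cM).
EQUAL_CHROMOSOMES = [2500.0 / 30] * 30
coverage_200 = simulate_coverage(200, 10.0, EQUAL_CHROMOSOMES)
```

Ignoring chromosome ends, a Poisson approximation gives a
covered fraction of 1 - exp(-2nd/L), i.e. about 0.80 for
n = 200 markers, d = 10 cM and L = 2500 cM, which is of the
same order as the figure quoted above.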

It is obvious that we are approaching a point where the
efficiency of a strategy based on the further accumulation
of random markers becomes questionable. After 200 probes,
the additional coverage obtained per new marker, expressed
as a fraction of the maximum coverage possible, is now
approximately 1/5 of the coverage obtained when we started
this project. This creates the need for more targeted
approaches. In this regard, mappers of domestic animal
genomes will benefit from the human mapping efforts and the
remarkable chromosomal conservation observed within mammals
(133). Based on comparative mapping information, it should
be possible to identify genes likely located in the "holes"
left by the random approach and to generate multisite
haplotypes or microsatellite markers around their bovine
homologues. We are presently exploring the feasibility of
such approaches.

With the markers available today, a substantial 
part of the bovine genome is now amenable to linkage 
scanning, which will hopefully allow the mapping of 
Economic Trait Loci and testing the feasibility of 
Marker Assisted Selection schemes. 
CLONING, CHARACTERIZATION AND "IN VITRO" AMPLIFICATION
OF BOVINE MICROSATELLITES

INTRODUCTION 

Recently, microsatellites were proven to be an abundant
source of highly polymorphic markers in the human (32-34).
As their name implies, microsatellites are minute VNTR
markers (18-20), characterized by tandem repetitions of very
short repeats, one to four base pairs in length.
Microsatellites exhibit levels of polymorphism comparable to
VNTRs, but are much more abundant and apparently evenly
spread throughout the genome.

We describe the cloning, characterization and "in vitro"
amplification of more than 100 such bovine microsatellites.

MATERIALS AND METHODS 

1. DNA Database Search

Bovine and ovine sequences in the EMBL and GenBank (version
64.0) databases were searched for all types of dinucleotide
and trinucleotide repeats using the IntelliGenetics
software, release 5.37. The minimum number of repeats was
set at six. Six bovine sequences, characterized by the
longest microsatellites, were retained for further analysis
and are listed in Table 6.

TABLE 6

GBIRBP
GBKCAS
GBPRLGR
GBCYP21
GBFSH
GBGAPR
2. Isolation of Bovine Microsatellites

Bovine genomic DNA was digested to completion with MboI and
size-fractionated by agarose-gel electrophoresis. Fragments
between 250 and 500 base pairs were recovered and purified
using "Geneclean", ligated into the BAP-dephosphorylated
BamHI site of pUC13 (Pharmacia), and cloned into E. coli
DH5α cells (BRL). The resulting clones were screened for the
presence of (TG)n microsatellites using a 32P-kinased (AC)n
oligonucleotide as probe, and for (AG)n microsatellites
using a (TC)15 probe. The library was made with female DNA
to avoid the previously characterized Y-specific TG-rich
bovine DYZ1 sequence (117).

3. Sequencing of Bovine Microsatellites

Positive clones were sequenced using one of the following
procedures:

(a) Plasmid DNA was prepared using standard "boiling
mini-prep" procedures and subjected to two chain-termination
sequencing reactions using unmodified T7 DNA Polymerase
(Pharmacia), with the "universal" and "reverse" sequencing
primers, respectively. The 35S-labeled sequencing products
were analyzed on standard denaturing polyacrylamide
sequencing gels and detected by autoradiography.

(b) Magnetic solid-phase sequencing (137). Alternatively,
positive colonies were grown in microtiter format using
standard procedures in order to establish glycerol stocks,
and 5 µl of culture were directly subjected to PCR
amplification using the following vector-specific primers:

UNIBIS: 5'-GATGTGCTGCAAGGCGATTAAGTTG-3'
REVBIS: 5'-CGGCTCGTATGTTGTGTGGAATTGT-3'

Two 30-cycle amplifications were carried out per clone, one
with the UNIBIS primer biotinylated, the other with the
REVBIS primer biotinylated. Denaturation was at 93°C for
1 min. (except for the first cycle: 95°C for 5 min.),
annealing at 60°C for 2 min., and extension at 72°C for
2 min. All of the PCR reactions were performed in microtiter
format using the TECHNE MW-2 heating device. The
biotinylated strand of the PCR product was captured using
DYNAL streptavidin-coated magnetic beads according to the
manufacturer's recommended conditions and sequenced using
unmodified T7 DNA polymerase (Pharmacia) as specified by the
manufacturer.

4. Amplification and Detection
of Bovine Microsatellites

(a) Simplex Amplification. The generated sequences are
organized in the following way:

5'-.... (UP) (TG)n -3'
3'-.... (AC)n (DN) -5'

Suitable primers for in vitro amplification are identified
in the "UPSTREAM (UP)" and "DOWNSTREAM (DN)" strands using
the "OPTIPRIM" program designed by one of us.

Given sequence information flanking a DNA stretch,
"Optiprim" helps the user to identify suitable primer pairs
for PCR amplification of the corresponding DNA stretch.

Description of the program: The two DNA sequences flanking
the DNA stretch to be amplified are referred to as the
upstream (UP) and downstream (DN) sequence, respectively.
Both for UP and DN, Optiprim tests all possible primers of
given length (as defined by the user) and retains the
primers (1) with melting temperature (Tm) within the range
defined by the user (Tm is calculated as 2°C x number of As
or Ts + 4°C x number of Gs or Cs), (2) with a minimum
percentage of each nucleotide as defined by the user, and
(3) which cannot form secondary bonds between two molecules
of the defined primer when these are slid in antiparallel
orientation against each other, as illustrated in the
following:

5'-PRIMER-3' >
           < 3'-REMIRP-5'

An A facing a T contributes two hydrogen bonds, and a G
facing a C contributes three hydrogen bonds. No loop
formation is considered when performing this analysis. This
generates two sets of selected primers: an UP set and a DN
set. All possible pairs of one UP and one DN primer are then
tested. Optiprim retains the primer pairs if (1) the
difference between melting temperatures of the two primers
is within a range defined by the user, and (2) the two
primers cannot form secondary structures, determined as for
individual primers, except that now the UP primer is slid
against the DN primer. Using this program, 80% of the
selected primer pairs gave successful PCR amplification in
our microsatellite systems. The following criteria are
considered by "OPTIPRIM" when searching for primers: primer
length, melting temperature and secondary structures that
can be formed within and between primers. The selected
primers are synthesized by phosphoramidite chemistry on
Applied Biosystems synthesizers and used without further
purification. The microsatellites are amplified in vitro, in
microtiter plates and using the Techne MW2 device, in the
following conditions (typically, 30 µl reactions):
Target DNA            50 ng-100 ng
KCl                   50 mM
Tris-HCl, pH 8.4      10 mM
MgCl2                 1.5 mM
Gelatine              0.01%
dNTP                  200 µM each
Primers               1 µM each
32P-dCTP              2 µCi/30 µl
AmpliTaq              1 U/30 µl
Thirty-cycle amplifications are performed, characterized by
a 93°C denaturation for 1 min. (except for the first cycle:
95°C, 5 min.), annealing at 55°C, 60°C or 65°C for 2 min.
depending on the primers, and extension at 72°C for 2 min.
Annealing temperatures are reduced by 5°C when using bovine
primers on ovine target DNA.
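
The primer screening rules described above for "Optiprim"
(the 2°C/4°C melting temperature rule and the hydrogen-bond
count obtained by sliding two primers against each other in
antiparallel orientation, without loops) can be sketched as
follows. This is only a reconstruction from the description,
not the program itself: all function names are hypothetical,
the thresholds max_self_bonds and max_pair_bonds are assumed
parameters that the text does not specify, and strand
orientation of the DN primers is left to the caller.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}
H_BONDS = {"A": 2, "T": 2, "G": 3, "C": 3}


def melting_temperature(primer):
    """Tm in degrees C: 2 per A or T plus 4 per G or C."""
    return sum(2 if base in "AT" else 4 for base in primer)


def max_duplex_bonds(primer_a, primer_b):
    """Largest hydrogen-bond count over all antiparallel alignments."""
    reversed_b = primer_b[::-1]  # primer_b read 3'->5', facing primer_a
    best = 0
    for offset in range(-(len(reversed_b) - 1), len(primer_a)):
        bonds = 0
        for i, base in enumerate(reversed_b):
            j = offset + i
            if 0 <= j < len(primer_a) and COMPLEMENT[primer_a[j]] == base:
                bonds += H_BONDS[base]
        best = max(best, bonds)
    return best


def candidate_primers(flank, length, tm_range, min_fraction, max_self_bonds):
    """All primers of the given length that pass the three single-primer tests."""
    kept = []
    for start in range(len(flank) - length + 1):
        primer = flank[start:start + length]
        if not tm_range[0] <= melting_temperature(primer) <= tm_range[1]:
            continue
        if any(primer.count(base) / length < min_fraction for base in "ACGT"):
            continue
        if max_duplex_bonds(primer, primer) > max_self_bonds:
            continue
        kept.append(primer)
    return kept


def compatible_pairs(up_set, dn_set, max_tm_diff, max_pair_bonds):
    """UP/DN pairs with similar Tm and limited bonding between the primers."""
    return [
        (up, dn)
        for up in up_set
        for dn in dn_set
        if abs(melting_temperature(up) - melting_temperature(dn)) <= max_tm_diff
        and max_duplex_bonds(up, dn) <= max_pair_bonds
    ]
```

Counting bonds per alignment rather than searching for the
longest complementary run mirrors the statement that no loop
formation is considered.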

(b) Multiplex Amplification. When performing multiplex
amplifications, concentrations of KCl, Tris-HCl, MgCl2,
gelatine and dNTPs are increased by 50%, while the primer
concentrations are decreased to 160 pM each.

(c) Detection of Microsatellite Products. 2 µl of PCR
reaction are mixed with the same volume of formamide dye and
run in a denaturing 7% acrylamide, 32% formamide, 5.6 M
urea, 13.5 mM Tris, 4.5 mM boric acid, 250 µM EDTA gel.
32P-labeled products are detected by autoradiography.



RESULTS AND DISCUSSION 

1. Cloning and sequence characterization of bovine
microsatellites:

A library of MboI fragments between 250 and 500 bp was
screened with the oligonucleotide probes. One out of 50
clones cross-hybridized. Assuming independent distribution
of microsatellites and MboI sites, the number of (TG)n
microsatellites in the bovine genome is estimated to be
>100,000. Table 7 summarizes the sequence information
available for about 230 such bovine microsatellites. For
each of these, sufficient sequence information has been
gathered to generate the required primers for PCR
amplification of the corresponding microsatellite. All
sequences were generated by sequencing as described above or
by screening GenBank.
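
The order of magnitude of this estimate can be checked with a
short calculation. The sketch below rests on two assumptions
that are not stated in the text: a haploid genome size of
3 x 10^9 bp and a mean insert size of 375 bp (the midpoint of
the 250-500 bp size fraction).

```python
def estimate_repeat_count(genome_bp, mean_insert_bp, positive_fraction):
    """Expected number of repeat loci if positive clones are spread evenly."""
    clones_per_genome = genome_bp / mean_insert_bp
    return clones_per_genome * positive_fraction


GENOME_BP = 3.0e9        # assumed haploid genome size (not given in the text)
MEAN_INSERT_BP = 375.0   # assumed: midpoint of the 250-500 bp fraction
estimate = estimate_repeat_count(GENOME_BP, MEAN_INSERT_BP, 1 / 50)
print(f"{estimate:,.0f} microsatellites")
```

Under these assumptions the calculation gives about 160,000
loci, consistent with the figure of more than 100,000 given
above.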
TABLE 7

Bovine Microsatellites

Sequence      Sequence Identification Numbers
Name          Up      Repeat  Down

AGLA13        1       2       3
AGLA17        4       5       6
AGLA206       7       8       9
AGLA209       10      11      12
AGLA212       13      14      15
AGLA215       16      17      18
AGLA217       19      20      21
AGLA218       22      23      24
AGLA22        25      26      27
AGLA220       28      29      30
AGLA223       31      32      33
AGLA226       34      35      36
AGLA227B      37      38      39
AGLA230       40      41      42
AGLA232       43      44      45
AGLA233       46      47      48
AGLA234       49      50      51
AGLA243       52      53      54
AGLA247       55      56      57
AGLA248       58      59      60
AGLA254       61      62      63
AGLA255       64      65      66
AGLA257       67      68      69
AGLA258       70      71      72
AGLA259       73      74      75
AGLA260       76      77      78
AGLA267       79      80      81
AGLA269       82      83      84
AGLA272       85      86      87
AGLA280       88      89      90
AGLA284       91      92      93
AGLA285       94      95      96
AGLA29        97      98      99
AGLA291       100     101     102
AGLA293       103     104     105
AGLA296       106     107     108
AGLA298       109     110     111
AGLA299       112     113     114
AGLA300       115     116     117
AGLA33        118     119     120
AGLA8         121     122     123
TABLE 7 (Continued)

GBFSH         124     125     126
GBIRBP        127     128     129
GBKCAS        130     131     132
GBPRLGR       133     134     135
MGTG1         136     137     138
MGTG11        139     140     141
MGTG13A       142     143     144
[illegible]   145     146     147
[illegible]   148     149     150
[illegible]   151     152     153
[illegible]   154     155     156
[illegible]   157     158     159
[illegible]   160     161     162
[illegible]   163     164     165
TGLA110       166     167     168
TGLA111       169     170     171
TGLA112       172     173     174
TGLA116       175     176     177
TGLA117       178     179     180
TGLA12        181     182     183
[illegible]   184     185     186
TGLA123       187     188     189
TGLA124       190     191     192
TGLA125       193     194     195
TGLA126       196     197     198
TGLA127       199     200     201
TGLA128       202     203     204
TGLA13        205     206     207
TGLA130       208     209     210
TGLA131       211     212     213
TGLA132       214     215     216
TGLA134       217     218     219
TGLA135       220     221     222
TGLA137       223     224     225
TGLA141       226     227     228
TGLA142       229     230     231
TGLA147       232     233     234
TGLA149       235     236     237
TGLA15        238     239     240
TGLA153       241     242     243
TABLE 7 (Continued)

TGLA154       244     245     246
TGLA158       247     248     249
TGLA159       250     251     252
TGLA160       253     254     255
TGLA162       256     257     258
TGLA164       259     260     261
TGLA17        262     263     264
TGLA170       265     266     267
TGLA171       268     269     270
TGLA172       271     272     273
TGLA175       274     275     276
TGLA176       277     278     279
TGLA179       280     281     282
TGLA182       283     284     285
TGLA188       286     287     288
TGLA189       289     290     291
TGLA2         292     293     294
TGLA20        295     296     297
TGLA203       298     299     300
TGLA206       301     302     303
TGLA208       304     305     306
TGLA210       307     308     309
TGLA213       310     311     312
TGLA214       313     314     315
TGLA215       316     317     318
TGLA22        319     320     321
TGLA222       322     323     324
TGLA226       325     326     327
TGLA227       328     329     330
TGLA23        331     332     333
TGLA231       334     335     336
TGLA245       337     338     339
TGLA25        340     341     342
TGLA254       343     344     345
TGLA255       346     347     348
TGLA257       349     350     351
TGLA26        352     353     354
TABLE 7 (Continued)

TGLA260       355     356     357
[illegible]   358     359     360
[illegible]   361     362     363
[illegible]   364     365     366
[illegible]   367     368     369
[illegible]   370     371     372
[illegible]   373     374     375
[illegible]   376     377     378
[illegible]   379     380     381
[illegible]   382     383     384
TGLA303       385     386     387
TGLA304       388     389     390
TGLA306       391     392     393
TGLA307       394     395     396
TGLA309       397     398     399
TGLA31        400     401     402
TGLA310       403     404     405
TGLA311       406     407     408
TGLA318       409     410     411
TGLA322       412     413     414
TGLA323       415     416     417
TGLA325       418     419     420
TGLA327       421     422     423
TGLA328       424     425     426
TGLA332       427     428     429
TGLA334       430     431     432
TGLA337       433     434     435
TGLA339       436     437     438
TGLA34        439     440     441
TGLA340       442     443     444
TGLA341       445     446     447
TGLA342       448     449     450
TABLE 7 (Continued)

TGLA345       451     452     453
TGLA346       454     455     456
TGLA35        457     458     459
TGLA351       460     461     462
TGLA353       463     464     465
TGLA354       466     467     468
TGLA357       469     470     471
TGLA36        472     473     474
TGLA37        475     476     477
TGLA377       478     479     480
TGLA378       481     482     483
TGLA380       484     485     486
TGLA381       487     488     489
TGLA382       490     491     492
TGLA387       493     494     495
TGLA39        496     497     498
TGLA394       499     500     501
TGLA4         502     503     504
TGLA40        505     506     507
TGLA400       508     509     510
TGLA414       511     512     513
TGLA415       514     515     516
TGLA417       517     518     519
TGLA419       520     521     522
TGLA420       523     524     525
TGLA421       526     527     528
TGLA423       529     530     531
TGLA424       532     533     534
TGLA427       535     536     537
TGLA429       538     539     540



WO 92/13102 



PCT/US92/00340 



TABLE 7 (Continued)

TGLA431       541     542     543
TGLA432       544     545     546
TGLA433       547     548     549
TGLA435       550     551     552
TGLA436       553     554     555
TGLA437       556     557     558
TGLA438       559     560     561
TGLA44        562     563     564
TGLA441       565     566     567
TGLA443       568     569     570
TGLA444       571     572     573
TGLA445       574     575     576
TGLA446       577     578     579
TGLA45        580     581     582
TGLA47        583     584     585
TGLA48        586     587     588
TGLA49        589     590     591
TGLA5         592     593     594
TGLA51        595     596     597
TGLA52        598     599     600
TGLA53        601     602     603
TGLA54        604     605     606
TGLA58        607     608     609
TGLA6         610     611     612
TGLA60A       613     614     615
TGLA60B       616     617     618
TGLA61        619     620     621
TGLA66A       622     623     624
TGLA67        625     626     627
TGLA68        628     629     630
TGLA69        631     632     633
TGLA70A       634     635     636
TGLA70B       637     638     639
TGLA72        640     641     642
TABLE 7 (Continued)

TGLA73        643     644     645
TGLA75        646     647     648
TGLA76        649     650     651
TGLA77        652     653     654
TGLA78        655     656     657
TGLA79        658     659     660
TGLA8         661     662     663
TGLA80        664     665     666
TGLA82        667     668     669
TGLA84        670     671     672
TGLA85        673     674     675
TGLA86        676     677     678
TGLA89        679     680     681
TGLA9         682     683     684
TGLA94        685     686     687
TGLA98        688     689     690
TGLA99        691     692     693
TGLB84        694     695     696
2. PCR-Amplification and Detection of
Microsatellites

(a) Simplex amplification

Table 8 reports a preliminary list of bovine microsatellite
systems that were successfully amplified in vitro, with the
corresponding primer pairs. Note that pairs of primers
selected by "OPTIPRIM" allow successful amplification in at
least one of our standard conditions more than 80% of the
time. Table 8 also gives the favoured annealing temperature
(using the TECHNE MW2 heating device). The mean
heterozygosity for the bovine microsatellites was estimated
at approximately 50%.
TABLE 8

PCR Amplified Bovine Microsatellites

Sequence   Up Primer     Up Primer  Down Primer   Down Primer  Annealing
Name       Name          ID         Name          ID

AGLA13     AGLA13UP1     697        [illegible]   698          55, 60
AGLA206    AGLA206UP1    699        [illegible]   700          55, 60
AGLA209    AGLA209UP1    701        [illegible]   702          55, 60
AGLA215    AGLA215UP1    703        [illegible]   704          60
AGLA217    AGLA217UP1    705        [illegible]   706          55, 60
AGLA22     AGLA22UP1     707        [illegible]   708          60
AGLA226    AGLA226UP1    709        [illegible]   710          55, 60
AGLA234    AGLA234UP1    711        [illegible]   712          55, 60
AGLA254    AGLA254UP1    713        [illegible]   714          55, 60
AGLA255    AGLA255UP1    715        [illegible]   716          55, 60
AGLA258    AGLA258UP1    717        [illegible]   718          55, 60
AGLA260    AGLA260UP1    719        [illegible]   720          55, 60
AGLA269    AGLA269UP1    721        [illegible]   722          55, 60
AGLA284    AGLA284UP1    723        [illegible]   724          55, 60
AGLA285    AGLA285UP1    725        AGLA285DN1    726          55, 60
AGLA29     AGLA29UP1     727        AGLA29DN1     728          55, 60
AGLA291    AGLA291UP1    729        [illegible]   730          55, 60
AGLA293    AGLA293UP1    731        [illegible]   732          55, 60
AGLA8      AGLA8UP1      733        [illegible]   734          55
GBFSH      [illegible]   735        GBFSHDN1      736          55
GBIRBP     GBIRBPUP1     737        GBIRBPDN1     738          60
GBKCAS     GBKCASUP1     739        GBKCASDN2     740          60
MGTG1      MGTG1UP3      741        MGTG1DN1      742          55, 60
MGTG13B    MGTG13BUP3    743        MGTG13BDN2    744          55, 60
MGTG3      MGTG3UP1      745        MGTG3DN2      746          55, 60
MGTG4B     MGTG4BUP2     747        MGTG4BDN2     748          55, 60
MGTG7      MGTG7UP3      749        MGTG7DN3      750          55, 60
TGLA10     TGLA10UP1     751        TGLA10DN1     752          60
TGLA111    TGLA111UP1    753        TGLA111DN1    754          60
TGLA116    TGLA116UP1    755        TGLA116DN1    756          60
TGLA117    TGLA117UP1    757        TGLA117DN1    758          60
TGLA12     TGLA12UP1     759        TGLA12DN2     760          55
TGLA122    TGLA122UP1    761        TGLA122DN1    762          60
TGLA123    TGLA123UP1    763        TGLA123DN1    764          60
TGLA124    TGLA124UP1    765        TGLA124DN1    766          60
TGLA125    TGLA125UP2    767        TGLA125DN2    768          55, 60
TGLA126    TGLA126UP1    769        TGLA126DN1    770          55
TGLA127    TGLA127UP1    771        TGLA127DN1    772          55
TGLA128    TGLA128UP1    773        TGLA128DN1    774          60
TGLA130    TGLA130UP1    775        TGLA130DN1    776          55





TABLE 8 (Continued)

TGLA132    TGLA132UP1    777        TGLA132DN1    778          55
TGLA134    [illegible]   779        TGLA134DN1    780          55, 60
TGLA137    TGLA137UP1    781        TGLA137DN1    782          60
TGLA142    TGLA142UP1    783        TGLA142DN1    784          60
TGLA147    TGLA147UP1    785        TGLA147DN1    786          60
TGLA15     [illegible]   787        [illegible]   788          60
[illegible] [illegible]  789        [illegible]   790          60
[illegible] [illegible]  791        [illegible]   792          60
TGLA159    TGLA159UP1    793        [illegible]   794          [illegible], 60
TGLA164    TGLA164UP1    795        TGLA164DN1    796          55, 60
TGLA170    TGLA170UP1    797        TGLA170DN1    798          60
TGLA176    TGLA176UP1    799        TGLA176DN1    800          60
TGLA182    TGLA182UP1    801        [illegible]   802          60
TGLA203    TGLA203UP1    803        [illegible]   804          60
TGLA206    TGLA206UP1    805        TGLA206DN1    806          60
TGLA210    TGLA210UP1    807        TGLA210DN1    808          60
TGLA214    TGLA214UP1    809        TGLA214DN1    810          55, 60
TGLA215    TGLA215UP1    811        TGLA215DN1    812          55, 60
TGLA22     TGLA22UP1     813        TGLA22DN1     814          60
TGLA227    TGLA227UP1    815        TGLA227DN1    816          55, 60
TGLA23     TGLA23UP1     817        TGLA23DN1     818          60
TGLA231    TGLA231UP1    819        TGLA231DN1    820          55, 60
TGLA245    TGLA245UP1    821        TGLA245DN1    822          55, 60
TGLA260    TGLA260UP1    823        TGLA260DN1    824          55, 60
TGLA263    [illegible]   825        [illegible]   826          55, 60
TGLA28     TGLA28UP3     827        TGLA28DN2     828          55, 60
TGLA303    TGLA303UP1    829        TGLA303DN1    830          60
TGLA304    TGLA304UP1    831        TGLA304DN1    832          60
TGLA307    TGLA307UP1    833        TGLA307DN1    834          55, 60
TGLA309    TGLA309UP1    835        TGLA309DN1    836          55, 60
TGLA322    TGLA322UP1    837        TGLA322DN1    838          55, 60
TGLA325    TGLA325UP1    839        TGLA325DN1    840          55, 60
TGLA327    TGLA327UP1    841        TGLA327DN1    842          60
TGLA328    TGLA328UP1    843        TGLA328DN1    844          55, 60
TGLA334    TGLA334UP1    845        TGLA334DN1    846          55, 60
TGLA337    TGLA337UP1    847        TGLA337DN1    848          60
TGLA339    TGLA339UP1    849        TGLA339DN1    850          55
TGLA34     TGLA34UP2     851        TGLA34DN1     852          60
TGLA340    TGLA340UP1    853        TGLA340DN1    854          55, 60
TGLA341    TGLA341UP1    855        TGLA341DN1    856          60
TGLA342    TGLA342UP1    857        TGLA342DN1    858          60
TGLA346    TGLA346UP1    859        TGLA346DN1    860          60
TGLA35     TGLA35UP1     861        TGLA35DN1     862          60
TABLE 8 (Continued)

TGLA351    [illegible]   863        TGLA351DN1    864          55, 60
TGLA353    [illegible]   865        [illegible]   866          60
TGLA354    TGLA354UP1    867        [illegible]   868          [illegible], 60
TGLA357    TGLA357UP1    869        [illegible]   870          60
TGLA36     TGLA36UP1     871        [illegible]   872
TGLA37     TGLA37UP1     873        [illegible]   874          60
TGLA377    [illegible]   875        [illegible]   876          55, 60
TGLA378    [illegible]   877        [illegible]   878          60
TGLA382    TGLA382UP1    879        [illegible]   880          60
TGLA387    [illegible]   881        TGLA387DN1    882          60
TGLA40     [illegible]   883        [illegible]   884          60
TGLA415    [illegible]   885        [illegible]   886          55, 60
TGLA420    [illegible]   887        TGLA420DN1    888          55, 60
TGLA421    [illegible]   889        [illegible]   890          55, 60
TGLA423    TGLA423UP1    891        TGLA423DN1    892          55, 60
TGLA431    TGLA431UP1    893        TGLA431DN1    894          55, 60
TGLA433    TGLA433UP1    895        TGLA433DN1    896          60
TGLA435    TGLA435UP1    897        TGLA435DN1    898          55, 60
TGLA44     TGLA44UP2     899        TGLA44DN1     900          55, 60
TGLA441    TGLA441UP1    901        TGLA441DN1    902          60, 60
TGLA444    TGLA444UP1    903        TGLA444DN1    904          55
TGLA45     TGLA45UP1     905        TGLA45DN1     906          60
TGLA47     TGLA47UP1     907        TGLA47DN1     908          55, 60
TGLA48     TGLA48UP1     909        TGLA48DN1     910          55, 60
TGLA49     TGLA49UP1     911        TGLA49DN2     912          55
TGLA51     TGLA51UP1     913        TGLA51DN1     914          60
TGLA52     TGLA52UP1     915        TGLA52DN1     916          55
TGLA53     TGLA53UP1     917        TGLA53DN1     918          55, 60
TGLA58     TGLA58UP1     919        TGLA58DN1     920          55, 60
TGLA6      TGLA6UP1      921        TGLA6DN1      922          60
TGLA60A    TGLA60AUP1    923        TGLA60ADN1    924          55
TGLA60B    TGLA60BUP1    925        TGLA60BDN1    926          55
TGLA61     TGLA61UP1     927        TGLA61DN1     928          55, 60
TGLA67     TGLA67UP1     929        TGLA67DN1     930          60
TABLE 8 (Continued)

TGLA68     TGLA68UP1     931        TGLA68DN1     932          60
TGLA72     TGLA72UP1     933        TGLA72DN1     934          55, 60
TGLA73     TGLA73UP1     935        TGLA73DN1     936          55
TGLA75     TGLA75UP1     937        TGLA75DN1     938          55
TGLA76     TGLA76UP1     939        TGLA76DN1     940          55
TGLA77     TGLA77UP1     941        TGLA77DN1     942          55, 60
TGLA80     TGLA80UP1     943        TGLA80DN1     944          60
TGLA82     TGLA82UP1     945        TGLA82DN1     946          55
TGLA86     TGLA86UP1     947        TGLA86DN1     948          55, 60
TGLA89     TGLA89UP1     949        TGLA89DN1     950          52
(b) Multiplex Amplification

To increase the speed and lower the cost of genotyping,
multiplex approaches for both amplification and data capture
of microsatellites are utilized. Microsatellite systems
yielding products of non-overlapping size were coamplified
as described above. Preliminary results show that at least
four different systems can easily be coamplified in these
standard conditions. The following multiplex amplifications,
for instance, were shown to yield consistent, easily
interpretable results:

a. GBCYP21 - TGLA10 - TGLA44 - TGLA116
b. TGLA9 - MGTG4B - TGLA23 - TGLA35
c. MGTG3 - MGTG13B

By limiting detection to a single detection procedure
(autoradiography of 32P-labeled product), multiplex
amplification is limited to systems yielding products of
non-overlapping size. To overcome this limitation,
alternative detection schemes are utilized. In particular,
confocal microscopy is used to detect products labeled with
laser-excitable fluorescent molecules (such as fluoresceine,
rhodamine, ...). The products can then be differentiated
based on the specific excitation and emission spectra of the
tagged fluorescent molecules. Using this approach, detection
of at least 20 different systems should be feasible.
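
The constraint that coamplified systems must yield products
of non-overlapping size can be illustrated with a simple
greedy grouping. This is a sketch only: the function names,
the system names and the size ranges in the example are
hypothetical, and the limit of four systems per reaction is
an assumption taken from the "at least four" figure above.

```python
def ranges_overlap(range_a, range_b):
    """True if two (min_bp, max_bp) product size ranges share any size."""
    return range_a[0] <= range_b[1] and range_b[0] <= range_a[1]


def greedy_multiplex_groups(size_ranges, max_per_group=4):
    """Assign each system to the first group whose products it does not overlap."""
    groups = []
    ordered = sorted(size_ranges.items(), key=lambda item: item[1])
    for name, size_range in ordered:
        for group in groups:
            fits = all(
                not ranges_overlap(size_range, size_ranges[member])
                for member in group
            )
            if fits and len(group) < max_per_group:
                group.append(name)
                break
        else:
            groups.append([name])
    return groups


# Hypothetical systems and allele size ranges in base pairs.
example_ranges = {
    "sysA": (100, 120),
    "sysB": (115, 140),
    "sysC": (150, 170),
}
example_groups = greedy_multiplex_groups(example_ranges)
```

A greedy pass does not guarantee the smallest possible number
of reactions, but it keeps every group free of size overlaps,
which is the property needed for autoradiographic detection.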

3. PCR-mapping of Bovine Microsatellites Using
Somatic Cell Hybrids

Results of the concordancy analysis are summarized in
Table 9. Synteny groups to which microsatellite systems most
likely map, as deduced from concordancy analysis, are
underlined. Clear-cut results were obtained for MGTG13B (U19
or chromosome 15), TGLA6 (U11), TGLA9 (U27), TGLA11 (U16),
TGLA22 (U26 or chromosome 26), TGLA23 (U11), TGLA36 (U27),
TGLA52 (U9 or chromosome 18). Results are less
discriminating for the other systems. Most likely synteny
groups are, however, mentioned. In addition, we know from
the literature that GBKCAS maps to U15 or chromosome 6, and
GBCYP21 to U20 or chromosome 23.
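
The principle of a concordancy analysis can be sketched as
follows, assuming that each hybrid cell line is scored 1 or 0
for the presence of the PCR product and for the presence of
each synteny group. The function names, panel and scores
below are hypothetical and serve only to show the
calculation; they are not data from this study.

```python
def concordance(marker_pattern, group_pattern):
    """Fraction of hybrid lines in which marker and group agree."""
    if len(marker_pattern) != len(group_pattern):
        raise ValueError("patterns must cover the same hybrid lines")
    agreements = sum(m == g for m, g in zip(marker_pattern, group_pattern))
    return agreements / len(marker_pattern)


def rank_synteny_groups(marker_pattern, panel):
    """Synteny groups sorted from most to least concordant with the marker."""
    scores = {
        name: concordance(marker_pattern, pattern)
        for name, pattern in panel.items()
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical panel of six hybrid lines and two synteny groups.
example_panel = {
    "group_1": [1, 0, 1, 1, 0, 0],
    "group_2": [0, 1, 1, 0, 0, 1],
}
example_marker = [1, 0, 1, 1, 0, 1]
ranking = rank_synteny_groups(example_marker, example_panel)
```

A marker is assigned with confidence only when one synteny
group is clearly more concordant than all others, which is
why some systems above give less discriminating results.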
EXAMPLE 4

CONSTRUCTION OF A PRIMARY BOVINE DNA MARKER MAP

Bovine pedigrees for a total of approximately 200
individuals were genotyped for 150 of these markers as
described. Pair-wise linkage analysis was performed using
the LODSCORE program. Only lodscore values greater than 3
were considered significant. This generated a primary DNA
marker map with 24 linkage groups counting two or more
markers (15 assigned to specific chromosomes or synteny
groups), and 68 singleton markers. Table 10 summarizes our
findings. Linkage groups were assigned to specific
chromosomes or synteny groups whenever that information was
available.
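
For orientation, the lodscore threshold can be illustrated
with the simplest possible case: phase-known, fully
informative meioses between two markers. The sketch below is
a simplification and not the LODSCORE program, which handles
general pedigrees; the function names are hypothetical.

```python
import math


def lod_score(recombinants, meioses, theta):
    """Two-point lodscore at recombination fraction theta (0 < theta <= 0.5)."""
    if not 0.0 < theta <= 0.5:
        raise ValueError("theta must lie in (0, 0.5]")
    non_recombinants = meioses - recombinants
    log_linked = (recombinants * math.log10(theta)
                  + non_recombinants * math.log10(1.0 - theta))
    log_unlinked = meioses * math.log10(0.5)
    return log_linked - log_unlinked


def max_lod(recombinants, meioses):
    """Best theta on a 0.01 grid and the lodscore reached there."""
    thetas = [percent / 100 for percent in range(1, 51)]
    best_theta = max(thetas, key=lambda t: lod_score(recombinants, meioses, t))
    return best_theta, lod_score(recombinants, meioses, best_theta)


# Example: 2 recombinants observed among 20 informative meioses.
best_theta, best_lod = max_lod(2, 20)
```

In this simplified setting, 2 recombinants in 20 meioses give
a maximum lodscore slightly above 3 at a recombination
fraction of 0.10, i.e. just enough to be declared significant
under the criterion used here.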

TABLE 10
Primary Bovine DNA Marker Map

CHR./SYNT.   LG   MARKER

Chr.2        1    GMBT28
             1    GMBT47
             1    GMBT61
             1    MSBT13
             1    TGLA11
             1    TGLA44
             1    TGLA61
             1    TGLA215
             1    TGLA159
             1    TGLA58
             1    TGLA60
             1    TGLA377
             1    MGTG4B
             2    TGLA116
             2    Weaver
Chr.6             GBKCAS
Chr.8             GMBT17
Chr.10            GMBT19
Chr.14            Thyroglobulin
Chr.15            GMBT6
             3    MGTG13B
             3    MSBT35
             4    GBFSH
             4    TGLA75
Chr.19            GH
Chr.21            GMBT22
             5    GMBT15
             5    GMBT16
             5    GMBT39
             5    MSBT29
             5    TGLA122
             5    TGLA337
Chr.23       6    GMBT12
             6    MSBT43
             6    MSBT70
             6    Prolactin
             6    BoLA
             6    GBCYP21
             6    AGLA291
             6    MGTG7
             6    TGLA142
             7    GMBT41
             7    MSBT6
             7    AGLA29
             7    TGLA126
             7    TGLA153
             7    TGLA214
Chr.24       8    GMBT5
             8    MSBT33
[illegible]       GMBT11
                  TGLA22
                  TGLA51
[illegible]       GMBT27
                  TGLA124
                  [illegible]
[illegible]       DYZ1
U1           9    GMBT42
             9    MGTG13A
             9    [illegible]
U10          10   GMBT7
             10   GMBT26
             10   MSBT122
             11   TGLA52
             11   TGLA57
             11   TGLA415
U11          12   TGLA6
             12   TGLA23
U16               TGLA5
                  GMBT1
             13   GMBT53
             13   TGLA206
U22               GMBT14
                  GMBT8
U23               GMBT36
             14   TGLA9
             14   TGLA36
             15   GMBT3
             15   GMBT29
             15   GMBT49
?            16   GMBT21
             16   GMBT33
?            17   GMBT24
             17   MSBT15
             17   TGLA164
             17   TGLA48
             17   TGLA303
?            18   GMBT18
             18   AGLA254
             19   AGLA226
             19   TGLA28
?            20   MGTG1
             20   TGLA245
?            21   MSBT11
             21   MSBT19
             21   TGLA227
             22   TGLA378
             22   TGLA433
?            23   TGLA51
             23   TGLA94
?            24   TGLA54
             24   TGLA68

CHR./SYNT. : Chromosome or synteny group 
LG: Linkage group 

CONCLUSIONS

Samples of E. coli harboring clones of polymorphic
bovid markers were deposited on 17 January 1991
with the American Type Culture Collection (Rockville,
Maryland) under accession numbers 68,514 and 68,515.
Deposit of the clones is for the sake of completeness
and is not intended to limit the scope of the instant
invention to said deposited materials. Access to said
cultures will be available during the pendency of the
application to those determined by the Commissioner of
Patents and Trademarks to be entitled thereto. All
restrictions on availability will be removed upon grant
of the application, and said cultures will remain available
during the life of the patent. Nonviable or
destroyed cultures will be replaced in kind.

It will be appreciated that the methods and compositions
of the instant invention can be incorporated
in the form of a variety of embodiments, only a few of
which are disclosed herein. It will be apparent to the
artisan that other embodiments exist and do not depart
from the spirit of the invention. Thus, the described
embodiments are illustrative and should not be construed
as restrictive.

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 307:

TTTCACAGTTGTCTAAATAAGAGAGTTATAATCACCCCACCCCCAGGTCA 
TGGTCTAGTGCTCTTCTTCCAGAAAAATCCAATCTAAGCATTTGGGTGAA 
GGGGGTCTGGCTGAAGACAACAGGA 

(2) INFORMATION FOR SEQ ID NO: 308: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 27 base pairs 

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Bos taurus 

(vii) IMMEDIATE SOURCE: 

(B) CLONE: TGLA210 (repeat) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 308: 
TGTGTGTGTGTGTGTGTGTGTGTGTGT 
(2) INFORMATION FOR SEQ ID NO: 309: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 133 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Bos taurus 

(vii) IMMEDIATE SOURCE:

(B) CLONE: TGLA210 (down) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 309: 

TCTAGAGGATTCCCTGGAGAAGAGAATGGCAACCCACTCCAGTATTCTTG 
CCTGGGAAATCCCATGGACAGAGGAGCCTGGTGGGTTATAGTCCATGAGG 
TTGCAAAGAGTCAGACAGGACTGAATGACTAAT 

(2) INFORMATION FOR SEQ ID NO: 310: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 183 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

WHAT IS CLAIMED IS:

1. A set of nucleic acid fragments that hybridize to
polymorphic loci in bovids, wherein said set comprises
fragments that hybridize to at least five
unique loci and each fragment hybridizes to a locus
comprising at least two alleles and with a heterozygosity
of at least 50%.

2. The set of claim 1, wherein said polymorphic loci
are selected from the group consisting of VNTR
loci, multisite haplotype loci, microsatellite loci
and combinations thereof.

3. The set of claim 2, wherein said polymorphic loci
are VNTR loci.

4. The set of claim 3, wherein said fragments are
selected from the group of VNTR markers identified
in Table 1.

5. The set of claim 2, wherein said polymorphic loci
are multisite haplotype loci.

6. The set of claim 5, wherein said fragments are
selected from the group of multisite haplotype
markers identified in Table 5.

7. The set of claim 2, wherein said polymorphic loci
are microsatellite loci.

8. The set of claim 7, wherein said fragments are
selected from the group of microsatellite markers
identified in Table 7.

9. The set of claim 1, wherein said bovids are
bovines.

10. The set of claim 9, wherein said bovines are of the
species Bos taurus.

11. The set of claim 1, wherein said bovids are ovines.

12. The set of claim 1, wherein said fragments are
obtained from a bovid genome.

13. The set of claim 12, wherein said bovid is a
bovine.

14. The set of claim 13, wherein said bovine is of the
species Bos taurus.

15. The set of claim 2, wherein said fragments are
selected from the group of VNTR markers identified
in Table 1, the group of multisite haplotype
markers identified in Table 5, the group of microsatellite
markers identified in Table 7, and
combinations thereof.

16. The set of claim 7, wherein said fragments are
selected from the group of microsatellite markers
identified in Table 8.

17. A synteny map of microsatellite markers identified
in Table 9.

18. A synteny map of VNTR markers identified in Table
4.

19. The microsatellite marker TGLA116 for the Weaver
condition.



20. The microsatellite marker TGLA116 for the QTL trait
of enhanced milk production in Brown Swiss cattle.

21. A set of nucleic acid fragments comprising at least
one fragment selected from the group consisting of
the VNTR markers identified in Table 1, the multisite
haplotype markers identified in Table 5, and
the microsatellite markers identified in Table 7.

22. A process for mapping quantitative traits in bovids
which comprises using the set of claim 21.

23. A process for genetic identification using the set
of claim 21.

24. A process for introducing a desired gene into a
bovid which comprises using the set of claim 21.

25. The process of claim 21, which further comprises
the use of velogenesis.
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